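An interview question about truncating a string!

Programming task: write a function that truncates a string. The input is a string and a byte count, and the output is the string truncated to that many bytes, with the guarantee that a Chinese character is never cut in half. For example, given "我ABC" and 4 the result should be "我AB"; given "我ABC汉DEF" and 6 the output should be "我ABC", not "我ABC" plus half of the character "汉".
---------------------------------
I got this interview question on Friday afternoon and have been thinking about it ever since I got back! Could everyone take a look, and please include method comments!

197 solutions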

#1

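When truncating through a byte array, check whether the cut position falls inside the Chinese range of the GBK encoding, something like that?

It seems a bit clumsy and may have efficiency problems.
I'll go and try writing it!!!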
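
One way to read this suggestion, as a minimal sketch in Java rather than the poster's actual code: encode the string to GBK, walk the bytes from the start, and treat any byte in the GBK lead-byte range 0x81 to 0xFE as the first byte of a two-byte character. The scan starts at the beginning because a GBK second byte can fall in the same range as a first byte. The names `GbkByteScan` and `truncate` are hypothetical, and the sketch assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;

final class GbkByteScan {
    // Assumption: the JDK provides the extended GBK charset.
    private static final Charset GBK = Charset.forName("GBK");

    /** Returns the longest prefix of text whose GBK encoding fits into maxBytes. */
    static String truncate(String text, int maxBytes) {
        byte[] bytes = text.getBytes(GBK);
        if (maxBytes >= bytes.length) {
            return text; // everything fits, nothing to cut
        }
        int end = 0; // number of bytes accepted so far, always on a character boundary
        while (end < bytes.length) {
            int unsigned = bytes[end] & 0xFF;
            // A lead byte (0x81..0xFE) starts a two-byte character; anything else is one byte.
            int width = (unsigned >= 0x81 && unsigned <= 0xFE) ? 2 : 1;
            if (end + width > maxBytes) {
                break; // the next character would cross the limit
            }
            end += width;
        }
        return new String(bytes, 0, end, GBK);
    }
}
```

With this sketch, `GbkByteScan.truncate("我ABC", 4)` gives "我AB" and `GbkByteScan.truncate("我ABC汉DEF", 6)` gives "我ABC".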

#2

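What you want to truncate is "bytes", so wouldn't that produce garbled text?

#3

Convert it.

#4

Quoting heavilyarmed's reply in #2:

What you want to truncate is "bytes", so wouldn't that produce garbled text?

Cutting a Chinese character in half does produce garbled text; in that case, drop the half character.

#5

Nice thread, let me think about it.

#6

Rather difficult.

#7

Marking this.

#8

I ran into the same interview question. It's a real pain. I could only think of the stringStream approach, which truncates by character count; I can't figure out how to do it by bytes. Following.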

#9

Check whether the last 2 bytes taken belong to a single Chinese character? Marking the thread for now.
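
Whether looking only at the bytes around the cut is enough depends on the encoding. In GBK the second byte of a character can fall in the same range as a first byte, so the tail alone can be ambiguous. In UTF-8, by contrast, every continuation byte has the bit pattern 10xxxxxx, so a local check at the cut point suffices. The sketch below deliberately swaps in UTF-8 (where a common Chinese character takes 3 bytes, not 2) to illustrate that local check; `Utf8BackUp` and `truncate` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class Utf8BackUp {
    /** Returns the longest prefix of text whose UTF-8 encoding fits into maxBytes. */
    static String truncate(String text, int maxBytes) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        if (maxBytes >= bytes.length) {
            return text; // everything fits, nothing to cut
        }
        int end = Math.max(maxBytes, 0);
        // If the first excluded byte is a continuation byte (10xxxxxx), the cut
        // lies inside a character: back up until it sits on a character start.
        while (end > 0 && (bytes[end] & 0xC0) == 0x80) {
            end--;
        }
        return new String(bytes, 0, end, StandardCharsets.UTF_8);
    }
}
```

Because "我" occupies 3 bytes in UTF-8, `Utf8BackUp.truncate("我ABC", 4)` gives "我A", and a limit of 2 gives the empty string.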

#10

Thinking about it.

#11

Waiting for an answer, here to learn.

#12

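I found a piece of code elsewhere:
public static String substring(String str, int toCount, String more)
{
    int reInt = 0;      // bytes accumulated so far
    String reStr = "";  // result being built
    if (str == null)
        return "";
    char[] tempChar = str.toCharArray();
    int kk = 0;
    for (; kk < tempChar.length; kk++)
    {
        // Byte length of this character in the platform default charset
        // (2 for a Chinese character under GBK, 3 under UTF-8)
        byte[] b = String.valueOf(tempChar[kk]).getBytes();
        // Stop before a character that would exceed the byte limit
        if (reInt + b.length > toCount)
            break;
        reInt += b.length;
        reStr += tempChar[kk];
    }
    // Append the suffix (for example "...") only if something was cut off
    if (kk < tempChar.length)
        reStr += more;
    return reStr;
}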

#13

Quoting heavilyarmed's reply in #12:

I found a piece of code elsewhere:
public static String substring(String str, int toCount,String more)
{
int reInt = 0;
String reStr = "";
if (str == null)
return "";
char[] tempChar = str.toCharArray();
for (int kk = 0; (kk < tempChar.length && toCount > reInt); kk++)
{
String s1 = str.valueOf(tempChar[kk]);
System.out.print(s1);
byte[] b = s1.getBytes();
reInt += b.length;
reStr += tempCh…

Looks a bit cumbersome!

#14

Sorry, I'm a beginner and my skills are limited.

#15

Kind of interesting.

#16

You can find it by searching online.

#17

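Take the characters one at a time and check whether each is a digit or an English letter.

If it is, append it; if not, take 2 bytes.

That's my line of thinking!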
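
A minimal Java sketch of this line of thinking, under the simplifying assumption (not stated by the poster) that every character below U+0080 costs 1 byte and every other character costs 2, as GBK does for common Chinese text. No encoder is involved; `AsciiOrWide` and `truncate` are hypothetical names.

```java
final class AsciiOrWide {
    /** Returns the longest prefix of text that fits into maxBytes under the 1-or-2 byte rule. */
    static String truncate(String text, int maxBytes) {
        StringBuilder out = new StringBuilder();
        int used = 0; // bytes consumed so far
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Assumption: ASCII costs 1 byte, everything else costs 2 bytes.
            int width = (c < 0x80) ? 1 : 2;
            if (used + width > maxBytes) {
                break; // this character would not fit completely
            }
            used += width;
            out.append(c);
        }
        return out.toString();
    }
}
```

Here `AsciiOrWide.truncate("我ABC", 4)` gives "我AB" and `AsciiOrWide.truncate("我ABC汉DEF", 6)` gives "我ABC". The fixed widths make this fast but tie it to one family of encodings.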

#18

Search Baidu; there are plenty of answers.

#19

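// requires: using System.Text;
public static string SubstringByByte(string str, int byteLength)
        {
           // Return an empty string rather than null when nothing fits
           string strings = string.Empty;
           if (str == null || byteLength <= 0)
               return strings;
           foreach (char temp in str)
           {
               // Byte cost of this character in UTF-8 (3 for most Chinese characters)
               int cost = Encoding.UTF8.GetByteCount(temp.ToString());
               // Stop before a character that no longer fits in the remaining budget
               if (cost > byteLength)
                   break;
               strings += temp.ToString();
               byteLength -= cost;
           }
           return strings;
        }
It's a bit cumbersome, but it seems to work.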
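
The byte count of a character depends on the encoding, which matters here: the question's examples assume 2 bytes per Chinese character (as in GBK), while the code above measures in UTF-8. A small Java sketch to make the difference visible; `ByteWidths` and `width` are hypothetical names, and any GBK lookup assumes the JDK ships that charset.

```java
import java.nio.charset.Charset;

final class ByteWidths {
    /** Number of bytes the given text occupies when encoded in the given charset. */
    static int width(String text, Charset charset) {
        return text.getBytes(charset).length;
    }
}
```

For the single character "我", `ByteWidths.width` reports 3 with `StandardCharsets.UTF_8` and 2 with `Charset.forName("GBK")`, so "我ABC" truncated to 4 bytes is "我AB" in GBK but only "我A" in UTF-8.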

#20

Does anyone have a better solution? I'd like to learn.

#21

Truncating by bytes should produce garbled text!!

mark

#22

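// A C# implementation:
        static string GetSubString(string str, int byteCount)
        {
            int count = 0;
            string result = string.Empty;
            foreach (char ch in str)
            {
                // Encoding.Default is the system ANSI code page on .NET Framework
                // (GBK on a Chinese Windows system, giving 2 bytes per Chinese character)
                count += System.Text.Encoding.Default.GetByteCount(ch.ToString());
                // Stop before a character that would exceed the byte limit
                if (count > byteCount) break;
                result += ch.ToString();
            }
            return result;
        }

        static void Main(string[] args) // caller
        {
            string str = "我ABC汉DEF";
            for (int i = 1; i < 10; i++)
            {
                Console.WriteLine("First " + i + " byte(s):");
                Console.WriteLine(GetSubString(str, i));
            }
        }

/* Output:
First 1 byte(s):

First 2 byte(s):
我
First 3 byte(s):
我A
First 4 byte(s):
我AB
First 5 byte(s):
我ABC
First 6 byte(s):
我ABC
First 7 byte(s):
我ABC汉
First 8 byte(s):
我ABC汉D
First 9 byte(s):
我ABC汉DE
*/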
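
For comparison, the same character-by-character idea can be sketched in Java. This is an illustrative port, not min_jie's code: `ByteLimitedSubstring` and `truncate` are hypothetical names, the charset is passed in explicitly instead of relying on a platform default, and the loop walks code points so that a surrogate pair is kept or dropped as a whole. It is meant for charsets that encode each character independently (GBK, UTF-8); a charset that emits a byte-order mark, such as Java's `UTF-16`, would be over-counted.

```java
import java.nio.charset.Charset;

final class ByteLimitedSubstring {
    /** Returns the longest prefix of text whose encoding in charset fits into maxBytes. */
    static String truncate(String text, int maxBytes, Charset charset) {
        StringBuilder out = new StringBuilder();
        int used = 0; // bytes consumed so far
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String ch = new String(Character.toChars(codePoint));
            int width = ch.getBytes(charset).length;
            if (used + width > maxBytes) {
                break; // this character would cross the limit, so stop before it
            }
            used += width;
            out.append(ch);
            i += Character.charCount(codePoint); // 2 for a surrogate pair, else 1
        }
        return out.toString();
    }
}
```

Called with `Charset.forName("GBK")` on "我ABC汉DEF" for limits 1 through 9, it yields the same prefixes as the output listed above.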

#23

asdfsadfasdfasdf

#24

I don't quite understand what you're saying...

#25

Bump!

#26

Quote

Bump!

#27

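public class Test {

/**
 * Prints the leading characters of str that fit into count bytes,
 * counting 1 byte for characters up to 255 and 2 bytes for all others.
 *
 * @param count maximum number of bytes
 * @param str   string to truncate
 */
public void output(int count, String str)
{
    int index = 0; // bytes used so far

    for (int i = 0; i < str.length() && index < count; i++)
    {
        char c = str.charAt(i);
        // Characters up to 255 are counted as one byte, everything else as two
        int width = (c <= 255) ? 1 : 2;
        index += width;
        // Print the character only if it fits completely
        if (index <= count) System.out.print(c);
    }
    System.out.println();
}

public static void main(String[] args) {
    Test t = new Test();
    t.output(6, "ab@毕AKDJSD");
}

}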

#28

On JDK 1.5 and above you can just use substring directly; it won't give you half a Chinese character.
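
A caveat worth illustrating: `String.substring` takes positions counted in `char` values (UTF-16 code units), not bytes. It therefore never splits a Chinese character from the Basic Multilingual Plane, but it also does not enforce a byte limit. A minimal Java sketch; `SubstringIsByChar`, `firstChars` and `utf8Length` are hypothetical helper names.

```java
import java.nio.charset.StandardCharsets;

final class SubstringIsByChar {
    /** Returns the first n chars of text (clamped to the string length). */
    static String firstChars(String text, int n) {
        int end = Math.max(0, Math.min(n, text.length()));
        return text.substring(0, end);
    }

    /** Number of bytes text occupies in UTF-8. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

`SubstringIsByChar.firstChars("我ABC汉DEF", 6)` returns "我ABC汉D": six chars, but 10 bytes in UTF-8 (8 in GBK), rather than the "我ABC" the question asks for with a limit of 6 bytes.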

#29

I need to study hard. I've only just started learning. I'm learning from all of you.

#30

mark

#31

mark

#32

sdsdfsdfsdf

#33

- - !!!!!!!!!!!!

#34

Learning!

#35

Still learning!

#36

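public String getResult(String input, int num)
{
  // Byte lengths use the platform default charset (2 bytes per Chinese character under GBK)
  int max = input.getBytes().length;
  // Handle the special cases
  if (num <= 0) return "";
  if (num >= max) return input;
  // Get the character array from the original string
  char[] cs = input.toCharArray();

  // Buffer for the characters that are kept (never more than num of them)
  char[] rs = new char[num];
  int kept = 0;
  for (int i = 0; i < num; i++)
  {
    // A character that encodes to more than one byte is treated as a Chinese character
    boolean wide = String.valueOf(cs[i]).getBytes().length != 1;
    // A Chinese character on the last iteration would be cut in half, so exit the loop;
    // otherwise it uses one extra byte, so reduce num by one (one iteration and one character fewer)
    if (wide && i == num - 1)
      break;
    else if (wide)
      num--;
    rs[kept++] = cs[i];
  }
  return new String(rs, 0, kept);
}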

#37

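It isn't explained very clearly, but I wrote it myself and it passed my tests with no problems.

	/**
	 * Checks the prefixes one by one to find the index at which the byte limit is reached.
	 * Byte lengths use the platform default charset.
	 *
	 * @param s the string to truncate
	 * @param b the maximum number of bytes
	 * @return the longest prefix of s that fits into b bytes
	 */
	public static String sss(String s, int b) {
		int byteNum = b;// the number of bytes wanted
		String sub = "";// holds the current prefix
		int index = 0;// length in characters of the last prefix known to fit, e.g. 我AB has length 3, not the byte count 4
		for (int i = 1; i <= s.length()
				&& byteNum - sub.getBytes().length > 0; i++) {
			sub = s.substring(0, i);
			index = i - 1;
		}
		if (sub.getBytes().length <= byteNum) {// the prefix fits (exactly or with room to spare), so return sub as is
			return sub;
		} else {
			return s.substring(0, index);// otherwise drop one character (i-1) so the result stays within the byte limit
		}
	}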
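
Building and re-encoding a longer prefix on every iteration, as above, costs time quadratic in the length of the kept prefix. A different technique, sketched here in Java and not taken from the post, lets a `CharsetEncoder` write into a `ByteBuffer` whose capacity is the byte limit: the encoder stops when the buffer is full and only consumes characters it could encode completely. `EncoderTruncate` and `truncate` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

final class EncoderTruncate {
    /** Returns the longest prefix of text whose encoding in charset fits into maxBytes. */
    static String truncate(String text, int maxBytes, Charset charset) {
        if (maxBytes <= 0) {
            return "";
        }
        CharsetEncoder encoder = charset.newEncoder()
                // Replace rather than fail on input the charset cannot represent.
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(maxBytes); // capacity is the byte limit
        // Encoding stops with an overflow result once the next character no longer fits.
        encoder.encode(in, out, true);
        // in.position() is the number of chars that were encoded completely.
        return text.substring(0, in.position());
    }
}
```

The design choice here is to let the charset implementation decide character boundaries, so the same method works for GBK and UTF-8 without hard-coded byte ranges.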

#38

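Truncating by bytes or by characters doesn't seem difficult in itself; the only real difficulty is working out what it is that you have cut out.

Here's my approach, which I find the most convenient.

Write a method with the code below; its job is to decide whether something is a letter or not, and then you just call it to check.

if (sChar.CompareTo("a") >= 0 && sChar.CompareTo("z") <= 0)
bReturn = true;   // lowercase letter
else if (sChar.CompareTo("A") >= 0 && sChar.CompareTo("Z") <= 0)
bReturn = true;   // uppercase letter
else
bReturn = false;  // anything else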

#39

Learned something.

#40

Quoting min_jie's reply in #22:

C# code
// A C# implementation:
        static string GetSubString(string str, int byteCount)
        {
            int count = 0;
            string result = string.Empty;
            foreach (char ch in str)
            {
                count += System.Text.Encoding.Default.GetByteCount(ch.ToString());
                if (count > byteCount) break;
                result += ch.ToString();
           …

This one's good!
MARKED BY CNDO

#41

I didn't understand what you mean.

#42

Replying is a virtue! Reply every day to earn 10 usable points!

#43

Interesting, I'll give it a try too.

#44

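class CopyStrByByte{
  private String str = "";   // the string
  private int copyNum = 0;   // number of bytes to copy
  private String arrStr[];   // the string split into single-character strings
  private int cutNum = 0;  // number of bytes taken so far
  private int cc = 0;   // number of Chinese characters taken so far

  public CopyStrByByte(String str,int copyNum){
  	this.str = str;
  	this.copyNum = copyNum;
  }

  // Byte lengths come from the platform default charset; the 1-or-2 test below assumes a GBK-like default
  public String CopyStr(){
  	if (copyNum <= 0) return "";  // nothing can be copied
  	arrStr = str.split(""); // split the given string into single-character strings
  	str = "";   // cleared, then used to collect the characters taken
    for (int i = 0;i < arrStr.length;i++){
    	if (arrStr[i].getBytes().length == 1){  // not a Chinese character
    		cutNum = cutNum + 1;
    		str = str + arrStr[i];
    	} else if (arrStr[i].getBytes().length == 2) {   // Chinese character
    		cc = cc + 1;
    		cutNum = cutNum + 2;
    		str = str + arrStr[i];
    	}
    	if (cutNum >= copyNum) break;  // bytes taken have reached or passed the requested count
    }
    if (cutNum > copyNum){	// overshot by half a Chinese character, so drop the last character
      return str.substring(0, copyNum - cc);
    } else {
    	return str;
    }
  }
}

public class TestCopyStr{
	public static void main(String args[]){
		CopyStrByByte cp = new CopyStrByByte("as论者afs为什么",12);
		System.out.println(cp.CopyStr());
	}
}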

#45

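Replying is a virtue! Reply every day to earn 10 usable points!

#46

Quoting witeye's reply in #44:

Java code
class CopyStrByByte{
  private String str = "";   // the string
  private int copyNum = 0;   // number of bytes to copy
  private String arrStr[];   // the string split into single-character strings
  private int cutNum = 0;  // number of bytes taken so far
  private int cc = 0;   // number of Chinese characters in str

  public CopyStrByByte(String str,int copyNum){
      this.str = str;
      this.copyNum = copyNum;
  }
  public String Co…

I wrote this a long time ago; the code could still be optimized a bit ;)

#47

Considering it.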

#48

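// requires: using System.Text;
public static string SubstringByByte(string str, int byteLength)
        {
          // Return an empty string rather than null when nothing fits
          string strings = string.Empty;
          if (str == null || byteLength <= 0)
              return strings;
          foreach (char temp in str)
          {
              // Byte cost of this character in UTF-8 (3 for most Chinese characters)
              int cost = Encoding.UTF8.GetByteCount(temp.ToString());
              // Stop before a character that no longer fits in the remaining budget
              if (cost > byteLength)
                  break;
              strings += temp.ToString();
              byteLength -= cost;
          }
          return strings;
        }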

#49



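package lihan;

import java.nio.charset.Charset;

/**
 * A solution for truncating a string that contains Chinese characters by bytes in Java.
 *
 * @author 李晗
 */
public class test {

    // Assumption: the thread counts 2 bytes per Chinese character, so encode explicitly
    // as GB2312, where both bytes of a Chinese character have the high bit set.
    private static final Charset CHARSET = Charset.forName("GB2312");

    public void splitIt(String splitStr, int bytes) {
        byte[] bt = splitStr.getBytes(CHARSET);
        System.out.println("Length of this String ===>" + bt.length);
        // Clamp the limit to the valid range of the byte array
        int limit = Math.max(0, Math.min(bytes, bt.length));
        // Count the bytes of Chinese characters (negative as signed bytes) within the limit
        int cutLength = 0;
        for (int i = 0; i < limit; i++) {
            if (bt[i] < 0) {
                cutLength++;
            }
        }
        // An odd count means the cut falls inside a Chinese character, so drop its first half
        int result = (cutLength % 2 == 0) ? limit : limit - 1;
        String substrx = new String(bt, 0, result, CHARSET);
        System.out.println(substrx);
    }

    public static void main(String args[]) {
        String str = "我abc的DEFe呀fgsdfg大撒旦";
        int num = 3;
        System.out.println("num:" + num);
        test sptstr = new test();
        sptstr.splitIt(str, num);
    }
}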

#50

Nice, bookmarked.
