How do I split a string into an array of characters? [duplicate]

Date: 2021-08-14 21:36:41

var s = "overpopulation";
var ar = [];
ar = s.split();
alert(ar);

I want to split a word into an array of characters.

The above code doesn't seem to work: it returns "overpopulation" as an Object (an array holding the whole string as its only element).

How do I split it into an array of characters if the original string doesn't contain commas or whitespace?

8 answers

#1


180  

You can split on an empty string:

var chars = "overpopulation".split('');

If you just want to access a string in an array-like fashion, you can do that without split:

var s = "overpopulation";
for (var i = 0; i < s.length; i++) {
    console.log(s.charAt(i));
}

You can also access each character with its index using normal array syntax. Note, however, that strings are immutable, which means you can't set the value of a character using this method, and that it isn't supported by IE7 (if that still matters to you).

var s = "overpopulation";

console.log(s[3]); // logs 'r'
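
A minimal sketch of what immutability means in practice. The names `word`, `letters`, and `capitalized` are illustrative only. Assigning to an index is ignored in sloppy mode and throws a TypeError in strict mode, so the attempt is wrapped in try/catch here. To change a character, one option is to copy the string into an array, edit the array, and join it back:

```javascript
var word = "overpopulation";

// Index assignment never changes the string: it is silently ignored in
// sloppy mode and throws a TypeError in strict mode (e.g. in ES modules).
try {
  word[0] = "O";
} catch (e) {
  // strict mode lands here
}
console.log(word[0]); // still "o"

// To "edit" a character, work on an array copy and join it back together.
var letters = word.split("");
letters[0] = "O";
var capitalized = letters.join("");
console.log(capitalized); // "Overpopulation"
```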

#2


45  

Old question but I should warn:

As noted, if your purpose is to access characters by an index, you can use str[index] (ES5) or str.charAt(index) and don't need a conversion.

EDIT: If you care, str[index] (ES5) and str.charAt(index) will also return weird results with non-BMP characters, e.g. '😎'.charAt(0) returns "�".


Do NOT use .split('')

You'll get weird results with non-BMP (non-Basic-Multilingual-Plane) character sets.

The reason is that methods like .split() and .charCodeAt() work on UTF-16 code units, so they only respect characters with a code point below 65536; higher code points are represented by a pair of (lower-valued) "surrogate" pseudo-characters.

'𝟘𝟙𝟚'.length     // —> 6
'𝟘𝟙𝟚'.split('')  // —> ["�", "�", "�", "�", "�", "�"]

'😎'.length      // —> 2
'😎'.split('')   // —> ["�", "�"]
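
To make the surrogate-pair explanation concrete, here is a small sketch that inspects one non-BMP character. The variable names are illustrative, and the character is written as an escape sequence so the example does not depend on file encoding:

```javascript
// "\u{1F60E}" is one non-BMP character (the sunglasses emoji, U+1F60E).
const face = "\u{1F60E}";

// In UTF-16 it is stored as two code units: a high and a low surrogate.
const units = [face.charCodeAt(0), face.charCodeAt(1)]
  .map((u) => u.toString(16));
console.log(face.length); // 2
console.log(units);       // ["d83d", "de0e"]

// codePointAt() (ES2015) reads the pair as one code point.
const codePoint = face.codePointAt(0).toString(16);
console.log(codePoint);   // "1f60e"

// split('') cuts between the two halves, leaving two lone surrogates.
const halves = face.split("");
console.log(halves.length); // 2
```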

Use ES2015 (ES6) features where possible:

Using the spread operator:

let arr = [...str];

Or Array.from

let arr = Array.from(str);

Or split with the new u RegExp flag:

let arr = str.split(/(?!$)/u);

Examples:

[...'𝟘𝟙𝟚']        // —> ["𝟘", "𝟙", "𝟚"]
[...'😎😜🙃']     // —> ["😎", "😜", "🙃"]
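
For comparison, the sketch below runs the approaches side by side on one short string. The name `sample` and the result variables are illustrative:

```javascript
const sample = "a\u{1F60E}b"; // "a", one emoji (U+1F60E), "b"

const byCodeUnit = sample.split("");     // breaks the emoji in two
const bySpread = [...sample];            // iterates by code point
const byArrayFrom = Array.from(sample);  // same iteration protocol
const byRegex = sample.split(/(?!$)/u);  // u flag steps by code point

console.log(byCodeUnit.length);  // 4
console.log(bySpread.length);    // 3
console.log(byArrayFrom.length); // 3
console.log(byRegex.length);     // 3
```

The three ES2015 variants agree with each other; only the plain `split("")` separates the surrogate pair.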

For ES5, options are limited:

I came up with this function that internally uses the MDN example to get the correct code point of each character.

function stringToArray(str) {
  var i = 0,
    arr = [],
    codePoint;
  while (!isNaN(codePoint = knownCharCodeAt(str, i))) {
    arr.push(String.fromCodePoint(codePoint));
    i++;
  }
  return arr;
}

This requires the knownCharCodeAt() function and, for some browsers, a String.fromCodePoint() polyfill.

if (!String.fromCodePoint) {
// ES6 Unicode Shims 0.1 , © 2012 Steven Levithan , MIT License
    String.fromCodePoint = function fromCodePoint () {
        var chars = [], point, offset, units, i;
        for (i = 0; i < arguments.length; ++i) {
            point = arguments[i];
            offset = point - 0x10000;
            units = point > 0xFFFF ? [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)] : [point];
            chars.push(String.fromCharCode.apply(null, units));
        }
        return chars.join("");
    }
}

Examples:

stringToArray('𝟘𝟙𝟚')     // —> ["𝟘", "𝟙", "𝟚"]
stringToArray('😎😜🙃')  // —> ["😎", "😜", "🙃"]
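
Since knownCharCodeAt() is not reproduced here, the following is a self-contained ES5-style sketch of the same idea. It is an alternative under stated assumptions, not the MDN helper: `splitByCodePointES5` is a hypothetical name, and the function simply keeps each high/low surrogate pair together while walking the code units.

```javascript
// Hypothetical helper, not the MDN knownCharCodeAt() function: walks the
// string by UTF-16 code unit and keeps each surrogate pair together.
function splitByCodePointES5(str) {
  var result = [];
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    var next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
    var isHigh = unit >= 0xD800 && unit <= 0xDBFF;
    var isLow = next >= 0xDC00 && next <= 0xDFFF;
    if (isHigh && isLow) {
      result.push(str.charAt(i) + str.charAt(i + 1)); // whole pair
      i++; // skip the low surrogate that was just consumed
    } else {
      result.push(str.charAt(i)); // BMP character or lone surrogate
    }
  }
  return result;
}

var parts = splitByCodePointES5("a\uD83D\uDE0Eb"); // escapes for U+1F60E
console.log(parts.length); // 3
```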

UPDATE: Read this nice article about JS and unicode.

#3


19  

It's as simple as:

s.split("");

The delimiter is an empty string, hence it will break up between each single character.

#4


9  

The split() method in JavaScript accepts two parameters: a separator and a limit. The separator specifies the character(s) to use for splitting the string. If you don't specify a separator, the entire string is returned unsplit, as the only element of the array. But if you specify the empty string as the separator, the string is split between each character.

Therefore:

s.split('')

will have the effect you seek.
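
A short sketch of the three cases described above (no separator, empty separator, and a limit). The variable names are illustrative:

```javascript
var text = "overpopulation";

var noSeparator = text.split();       // ["overpopulation"], one element
var emptySeparator = text.split("");  // one element per UTF-16 code unit
var limited = text.split("", 4);      // ["o", "v", "e", "r"]

console.log(noSeparator.length);    // 1
console.log(emptySeparator.length); // 14
console.log(limited.join(""));      // "over"
```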

More information here

#5


5  

You can use the regular expression /(?!$)/:

"overpopulation".split(/(?!$)/)

The negative look-ahead assertion (?!$) will match right in front of every character.
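
To see where the look-ahead matches, this sketch lists the match positions with matchAll() (ES2020, an assumption about the runtime) and then performs the split. The names `lookahead`, `positions`, and `pieces` are illustrative:

```javascript
const lookahead = /(?!$)/g; // the g flag is required by matchAll()

// Every match is empty; its index is the position just before a character.
const positions = Array.from("abc".matchAll(lookahead), (m) => m.index);
console.log(positions); // [0, 1, 2]

// Splitting at those positions yields the individual characters.
const pieces = "abc".split(/(?!$)/);
console.log(pieces); // ["a", "b", "c"]
```

Without the u flag this regex still works on UTF-16 code units, so it splits non-BMP characters the same way `split('')` does.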

#6


4  

A string in JavaScript is already array-like: its characters can be read by index.

You can simply access any character in the string as you would an element of any other array.

var s = "overpopulation";
alert(s[0]) // alerts o.

UPDATE

As is pointed out in the comments below, the above method for accessing a character in a string is part of ECMAScript 5, which certain browsers may not conform to.

An alternative method you can use is charAt(index).

var s = "overpopulation";
alert(s.charAt(0)) // alerts o.
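
One caveat worth sketching: indexed access does not make a string a real Array. The example below (illustrative names only) shows that array methods are missing on the string itself but can be borrowed, because a string has a length and indexed characters:

```javascript
var wordStr = "overpopulation";

console.log(Array.isArray(wordStr)); // false: a string is not an Array
console.log(typeof wordStr.map);     // "undefined": no array methods

// Array methods can still be borrowed through Function.prototype.call.
var upper = Array.prototype.map.call(wordStr, function (ch) {
  return ch.toUpperCase();
});
console.log(upper.join(""));         // "OVERPOPULATION"
```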

#7


1  

To support emojis, use this:

('Dragon 🐉').split(/(?!$)/u);

=> ['D', 'r', 'a', 'g', 'o', 'n', ' ', '🐉']

#8


1  

.split('') would split emojis in half.

Onur's solutions and the proposed regexes work for some emojis, but can't handle more complex languages or combined emojis. Consider this emoji being ruined:

[..."🏳️‍🌈"] // returns ["🏳", "️", "‍", "🌈"]  instead of ["🏳️‍🌈"]

Also consider this Hindi text "अनुच्छेद" which is split like this:

[..."अनुच्छेद"]  // returns   ["अ", "न", "ु", "च", "्", "छ", "े", "द"]

but should in fact be split like this:

["अ","नु","च्","छे","द"]

because some of the characters are combining marks (think diacritics/accents in European languages).

You can use the grapheme-splitter library for this:

https://github.com/orling/grapheme-splitter

It does proper standards-based splitting into letters in all the hundreds of exotic edge cases (yes, there are that many).
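
Where the runtime provides the built-in Intl.Segmenter API, a similar grapheme-aware split can be sketched without a library. This is a substitute technique, not the grapheme-splitter library itself; `toGraphemes` is a hypothetical helper name, and the exact clusters produced for some scripts depend on the Unicode data shipped with the engine.

```javascript
// Hypothetical helper built on the standard Intl.Segmenter API.
function toGraphemes(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(str), (part) => part.segment);
}

// White flag + variation selector + zero-width joiner + rainbow.
const flag = "\u{1F3F3}\uFE0F\u200D\u{1F308}";
console.log([...flag].length);         // 4 code points
console.log(toGraphemes(flag).length); // 1 grapheme cluster

// "e" followed by a combining acute accent stays together as well.
console.log(toGraphemes("e\u0301").length); // 1
```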
