Is the upper bound of an indexed range always assumed to be exclusive?

Time: 2021-04-08 13:09:10

So in Java, whenever an indexed range is given, the upper bound is almost always exclusive.

From java.lang.String:

substring(int beginIndex, int endIndex)

Returns a new string that is a substring of this string. The substring begins at the specified beginIndex and extends to the character at index endIndex - 1

From java.util.Arrays:

copyOfRange(T[] original, int from, int to)

from - the initial index of the range to be copied, inclusive
to - the final index of the range to be copied, exclusive.

From java.util.BitSet:

set(int fromIndex, int toIndex)

fromIndex - index of the first bit to be set.
toIndex - index after the last bit to be set.

As you can see, it does look like Java tries to make it a consistent convention that upper bounds are exclusive.
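
To make the three quoted signatures concrete, here is a small sketch that passes the same pair (2, 4) to each of them. The class name and helper methods are made up for illustration; only substring, Arrays.copyOfRange and BitSet.set come from the JDK.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical demo class; each helper passes (2, 4) to one of the JDK methods quoted above.
class ExclusiveUpperBoundDemo {

    static String substringDemo() {
        return "abcdef".substring(2, 4); // characters at indices 2 and 3
    }

    static Integer[] copyOfRangeDemo() {
        Integer[] source = {10, 11, 12, 13, 14, 15};
        return Arrays.copyOfRange(source, 2, 4); // elements at indices 2 and 3
    }

    static BitSet bitSetDemo() {
        BitSet bits = new BitSet();
        bits.set(2, 4); // sets bits 2 and 3, leaves bit 4 clear
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(substringDemo());                    // cd
        System.out.println(Arrays.toString(copyOfRangeDemo())); // [12, 13]
        System.out.println(bitSetDemo());                       // {2, 3}
    }
}
```

In all three cases index 4 is left out, so each call touches exactly 4 - 2 = 2 elements.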

My questions are:

  • Is this the official authoritative recommendation?

  • Are there notable violations that we should be wary of?

  • Is there a name for this system? (à la "0-based" vs "1-based")


CLARIFICATION: I fully understand that a collection of N objects in a 0-based system is indexed 0..N-1. My question is that if a range (2,4) is given, it can be either 3 items or 2, depending on the system. What do you call these systems?

AGAIN, the issue is not "first index 0 last index N-1" vs "first index 1 last index N" system; that's known as the 0-based vs 1-based system.

The issue is "There are 3 elements in (2,4)" vs "There are 2 elements in (2,4)" systems. What do you call these, and is one officially sanctioned over the other?
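
The two readings of (2,4) can be written down as two counting rules. This is only a sketch; the class and method names are invented here, and "closed" and "half-open" are the labels the answers below use for the two systems.

```java
// Hypothetical helper class contrasting the two ways to read the range (2, 4).
class RangeCount {

    // Closed range [from, to]: both endpoints are included.
    static int closedCount(int from, int to) {
        return to - from + 1;
    }

    // Half-open range [from, to): the upper bound is excluded.
    static int halfOpenCount(int from, int to) {
        return to - from;
    }

    public static void main(String[] args) {
        System.out.println(closedCount(2, 4));   // 3 -> indices 2, 3, 4
        System.out.println(halfOpenCount(2, 4)); // 2 -> indices 2, 3
    }
}
```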

6 answers

#1


5  

In general, yes. If you are working in a language with C-like syntax (C, C++, Java), then arrays are zero-indexed, and most random access data structures (vectors, array-lists, etc.) are going to be zero-indexed as well.

Starting indices at zero means that the size of the data structure is always going to be one greater than the last valid index in the data structure. People often want to know the size of things, of course, and so it's more convenient to talk about the size than to talk about the last valid index. People get accustomed to talking about ending indices in an exclusive fashion, because an array a[] that is n elements long has its last valid element in a[n-1].

There is another advantage to using an exclusive index for the ending index, which is that you can compute the size of a sublist by subtracting the inclusive beginning index from the exclusive ending index. If I call myList.subList(3, 7), then I get a sublist with 7 - 3 = 4 elements in it. If the subList() method had used inclusive indices for both ends of the list, then I would need to add an extra 1 to compute the size of the sublist.

This is particularly handy when the starting index is a variable: getting the sublist of myList starting at i that is 5 elements long is just myList.subList(i, i + 5).
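
A quick sketch of that arithmetic using java.util.List.subList. The window helper and the sample list are assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo of the "start, start + length" idiom with List.subList.
class SubListDemo {

    // Returns the view of `length` elements starting at index `start`.
    static <T> List<T> window(List<T> list, int start, int length) {
        return list.subList(start, start + length);
    }

    public static void main(String[] args) {
        List<Integer> myList = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        System.out.println(myList.subList(3, 7).size()); // 4, i.e. 7 - 3
        System.out.println(window(myList, 2, 5));        // [2, 3, 4, 5, 6]
    }
}
```

With inclusive bounds on both ends, the same window would have to be written as (start, start + length - 1).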

All of that being said, you should always read the API documentation, rather than assuming that a given beginning index or ending index will be inclusive or exclusive. Likewise, you should document your own code to indicate if any bounds are inclusive or exclusive.

#2


2  

It's just 0 to n-1 based.

A list/array containing 10 items is indexed 0-9.

You cannot have a 0-indexed list that runs 0-n where the count is n; that would include an item that does not exist...

This is the typical way things work.

  1. Yes.
  2. Excel Ranges/Sheets/Workbooks.
  3. Index (information technology)

#3


2  

Credit goes to FredOverflow, who said in a comment that this is called the "half-open range". So presumably, Java Collections can be described as "0-based with half-open ranges".

I've compiled some discussions about half-open vs closed ranges elsewhere:


siliconbrain.com - 16 good reasons to use half-open ranges (edited for conciseness):

  • The number of elements in the range [n, m) is just m-n (and not m-n+1).

  • The empty range is [n, n) (and not [n, n-1], which can be a problem if n is an iterator already pointing to the first element of a list, or if n == 0).

  • For floats you can write [13, 42) (instead of [13, 41.999999999999]).

  • The +1 and -1 are almost never needed when handling ranges. This is an advantage if they are expensive (as they are for dates).

  • If you write a find in a range, the fact that nothing was found can easily be indicated by returning the end as the found position: if( find( [begin, end) ) == end) nothing found.

  • In languages that start array subscripts at 0 (like C, C++, Java, NCL), the upper bound is equal to the size.
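
A rough sketch of the "return end when nothing is found" point from the list above. The find helper is hypothetical and is not how the JDK's own search methods behave (List.indexOf, for example, returns -1); it only illustrates the convention on a plain int[].

```java
// Hypothetical search over the half-open range values[begin, end).
class HalfOpenSearch {

    // Returns the index of the first match, or `end` itself when nothing is found.
    static int find(int[] values, int begin, int end, int target) {
        for (int i = begin; i < end; i++) {
            if (values[i] == target) {
                return i;
            }
        }
        return end;
    }

    public static void main(String[] args) {
        int[] values = {5, 8, 13, 21};
        int end = values.length; // the exclusive upper bound equals the size
        System.out.println(find(values, 0, end, 13));        // 2
        System.out.println(find(values, 0, end, 99) == end); // true: not found
        System.out.println(find(values, 0, 0, 5) == 0);      // true: [0, 0) is empty
    }
}
```

The empty range [0, 0) needs no special case: the loop body never runs and the function returns end.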


Half-open versus closed ranges

Advantages of half-open ranges:

  • Empty ranges are valid: [0 .. 0]

  • Easy for subranges to go to the end of the original: [x .. $]

  • Easy to split ranges: [0 .. x] and [x .. $]

Advantages of closed ranges:

  • Symmetry.

  • Arguably easier to read.

  • ['a' ... 'z'] does not require awkward + 1 after 'z'.

  • [0 ... uint.max] is possible.

That last point is very interesting. It's really awkward to write a numberIsInRange(int n, int min, int max) predicate with a half-open range if Integer.MAX_VALUE could be legally in a range.
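
Here is a minimal sketch of that awkwardness. Both predicates are hypothetical stand-ins for numberIsInRange, one per convention.

```java
// Hypothetical range predicates showing why Integer.MAX_VALUE is awkward for half-open int ranges.
class RangeCheck {

    // Closed range [min, max]: max itself is a member.
    static boolean inClosedRange(int n, int min, int max) {
        return min <= n && n <= max;
    }

    // Half-open range [min, max): max is excluded.
    static boolean inHalfOpenRange(int n, int min, int max) {
        return min <= n && n < max;
    }

    public static void main(String[] args) {
        System.out.println(inClosedRange(Integer.MAX_VALUE, 0, Integer.MAX_VALUE));   // true
        System.out.println(inHalfOpenRange(Integer.MAX_VALUE, 0, Integer.MAX_VALUE)); // false
        // Trying to "fix" it with max + 1 overflows to Integer.MIN_VALUE,
        // which silently turns the range into an empty one.
        System.out.println(inHalfOpenRange(5, 0, Integer.MAX_VALUE + 1));             // false
    }
}
```

No int value passed as max can make the half-open version accept Integer.MAX_VALUE, so the bound would need a wider type such as long.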

#4


0  

This practice was introduced by Josh Bloch to the Collections API as a contract.

After that it became a standard in Java, and when anybody decides to create a public library, they assume they should keep the contract, because users expect to see already-known behavior in new libraries.

#5


0  

The indexes in array-like data structures are indeed always 0-based. A String is basically backed by a char[]. Under the hood, the Collections framework is based on arrays, and so on. This makes designing, maintaining, and using the API easier, without changing the "under-the-hood" way of accessing the desired element(s) in the array.

There are, however, some "exceptions", such as the parameterIndex-based setter methods of PreparedStatement and the columnIndex-based getter methods of ResultSet. They are 1-based. Behind the scenes, they also do not really represent an array of values.

This would probably bring up a new question: "Why are array indexes zero-based?". Now, the respected computer scientist E.W. Dijkstra explains here why it should start with zero.

#6


0  

The easy way to think of half-open ranges is this: the first term identifies the start of elements within the range, and the second term identifies the start of elements after the range. Keep that in mind, and it all makes a good deal more sense. Plus the arithmetic works out better in many cases, per @polygenelubricants' answer.
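
One way to see the "second term is the start of what comes after" idea is to cut a list into consecutive pieces. This is only a sketch; the chunks helper is made up for illustration and relies on List.subList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that splits a list into consecutive half-open ranges.
class ChunkDemo {

    // Each chunk covers [start, end); the next chunk starts exactly at `end`.
    static <T> List<List<T>> chunks(List<T> list, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < list.size(); start += size) {
            int end = Math.min(start + size, list.size());
            result.add(list.subList(start, end));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(chunks(Arrays.asList("a", "b", "c", "d", "e"), 2)); // [[a, b], [c, d], [e]]
    }
}
```

Because one chunk's end is the next chunk's start, no element is skipped or repeated and no +1 or -1 appears anywhere.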

