/*
* Copyright (c) 1994, 2019, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation. Oracle designates this
* particular file as subject to the "Classpath" exception as provided
* by Oracle in the LICENSE file that accompanied this code.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/
package java.lang;
import java.io.ObjectStreamField;
import java.io.UnsupportedEncodingException;
import java.lang.annotation.Native;
import java.lang.invoke.MethodHandles;
import java.lang.constant.Constable;
import java.lang.constant.ConstantDesc;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Formatter;
import java.util.List;
import java.util.Locale;
import java.util.Objects;
import java.util.Optional;
import java.util.Spliterator;
import java.util.StringJoiner;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
import jdk.internal.HotSpotIntrinsicCandidate;
import jdk.internal.vm.annotation.Stable;
import static java.util.function.Predicate.not;
The String
class represents character strings. All string literals in Java programs, such as "abc"
, are implemented as instances of this class.
Strings are constant; their values cannot be changed after they
are created. String buffers support mutable strings.
Because String objects are immutable they can be shared. For example:
String str = "abc";
is equivalent to:
char data[] = {'a', 'b', 'c'};
String str = new String(data);
Here are some more examples of how strings can be used:
System.out.println("abc");
String cde = "cde";
System.out.println("abc" + cde);
String c = "abc".substring(2, 3);
String d = cde.substring(1, 2);
The class String
includes methods for examining individual characters of the sequence, for comparing strings, for searching strings, for extracting substrings, and for creating a copy of a string with all characters translated to uppercase or to lowercase. Case mapping is based on the Unicode Standard version specified by the Character
class.
The Java language provides special support for the string
concatenation operator ( + ), and for conversion of
other objects to strings. For additional information on string
concatenation and conversion, see The Java™ Language Specification.
Unless otherwise noted, passing a null
argument to a constructor or method in this class will cause a NullPointerException
to be thrown.
A String
represents a string in the UTF-16 format in which supplementary characters are represented by surrogate
pairs (see the section Unicode
Character Representations in the Character
class for more information). Index values refer to char
code units, so a supplementary character uses two positions in a String
.
The String
class provides methods for dealing with Unicode code points (i.e., characters), in addition to those for dealing with Unicode code units (i.e., char
values).
Unless otherwise noted, methods for comparing Strings do not take locale into account. The Collator
class provides methods for finer-grain, locale-sensitive String comparison.
Author: Lee Boynton, Arthur van Hoff, Martin Buchholz, Ulf Zibis See Also: Implementation Note: The implementation of the string concatenation operator is left to
the discretion of a Java compiler, as long as the compiler ultimately conforms
to The Java™ Language Specification. For example, the javac
compiler may implement the operator with StringBuffer
, StringBuilder
, or java.lang.invoke.StringConcatFactory
depending on the JDK version. The implementation of string conversion is typically through the method toString
, defined by Object
and inherited by all classes in Java. Since: 1.0 @jls 15.18.1 String Concatenation Operator +
/**
* The {@code String} class represents character strings. All
* string literals in Java programs, such as {@code "abc"}, are
* implemented as instances of this class.
* <p>
* Strings are constant; their values cannot be changed after they
* are created. String buffers support mutable strings.
* Because String objects are immutable they can be shared. For example:
* <blockquote><pre>
* String str = "abc";
* </pre></blockquote><p>
* is equivalent to:
* <blockquote><pre>
* char data[] = {'a', 'b', 'c'};
* String str = new String(data);
* </pre></blockquote><p>
* Here are some more examples of how strings can be used:
* <blockquote><pre>
* System.out.println("abc");
* String cde = "cde";
* System.out.println("abc" + cde);
* String c = "abc".substring(2, 3);
* String d = cde.substring(1, 2);
* </pre></blockquote>
* <p>
* The class {@code String} includes methods for examining
* individual characters of the sequence, for comparing strings, for
* searching strings, for extracting substrings, and for creating a
* copy of a string with all characters translated to uppercase or to
* lowercase. Case mapping is based on the Unicode Standard version
* specified by the {@link java.lang.Character Character} class.
* <p>
* The Java language provides special support for the string
* concatenation operator ( + ), and for conversion of
* other objects to strings. For additional information on string
* concatenation and conversion, see <i>The Java™ Language Specification</i>.
*
* <p> Unless otherwise noted, passing a {@code null} argument to a constructor
* or method in this class will cause a {@link NullPointerException} to be
* thrown.
*
* <p>A {@code String} represents a string in the UTF-16 format
* in which <em>supplementary characters</em> are represented by <em>surrogate
* pairs</em> (see the section <a href="Character.html#unicode">Unicode
* Character Representations</a> in the {@code Character} class for
* more information).
* Index values refer to {@code char} code units, so a supplementary
* character uses two positions in a {@code String}.
* <p>The {@code String} class provides methods for dealing with
* Unicode code points (i.e., characters), in addition to those for
* dealing with Unicode code units (i.e., {@code char} values).
*
* <p>Unless otherwise noted, methods for comparing Strings do not take locale
* into account. The {@link java.text.Collator} class provides methods for
* finer-grain, locale-sensitive String comparison.
*
* @implNote The implementation of the string concatenation operator is left to
* the discretion of a Java compiler, as long as the compiler ultimately conforms
* to <i>The Java™ Language Specification</i>. For example, the {@code javac} compiler
* may implement the operator with {@code StringBuffer}, {@code StringBuilder},
* or {@code java.lang.invoke.StringConcatFactory} depending on the JDK version. The
* implementation of string conversion is typically through the method {@code toString},
* defined by {@code Object} and inherited by all classes in Java.
*
* @author Lee Boynton
* @author Arthur van Hoff
* @author Martin Buchholz
* @author Ulf Zibis
* @see java.lang.Object#toString()
* @see java.lang.StringBuffer
* @see java.lang.StringBuilder
* @see java.nio.charset.Charset
* @since 1.0
* @jls 15.18.1 String Concatenation Operator +
*/
public final class String
implements java.io.Serializable, Comparable<String>, CharSequence,
Constable, ConstantDesc {
The value is used for character storage.
Implementation Note: This field is trusted by the VM, and is a subject to constant folding if String instance is constant. Overwriting this field after construction will cause problems. Additionally, it is marked with Stable
to trust the contents of the array. No other facility in JDK provides this functionality (yet). Stable
is safe here, because value is never null.
/**
* The value is used for character storage.
*
* @implNote This field is trusted by the VM, and is a subject to
* constant folding if String instance is constant. Overwriting this
* field after construction will cause problems.
*
* Additionally, it is marked with {@link Stable} to trust the contents
* of the array. No other facility in JDK provides this functionality (yet).
* {@link Stable} is safe here, because value is never null.
*/
@Stable
private final byte[] value;
The identifier of the encoding used to encode the bytes in value
. The supported values in this implementation are LATIN1 UTF16 Implementation Note: This field is trusted by the VM, and is a subject to
constant folding if String instance is constant. Overwriting this
field after construction will cause problems.
/**
* The identifier of the encoding used to encode the bytes in
* {@code value}. The supported values in this implementation are
*
* LATIN1
* UTF16
*
* @implNote This field is trusted by the VM, and is a subject to
* constant folding if String instance is constant. Overwriting this
* field after construction will cause problems.
*/
private final byte coder;
Cache the hash code for the string /** Cache the hash code for the string */
private int hash; // Default to 0
Cache if the hash has been calculated as actually being zero, enabling
us to avoid recalculating this.
/**
* Cache if the hash has been calculated as actually being zero, enabling
* us to avoid recalculating this.
*/
private boolean hashIsZero; // Default to false;
use serialVersionUID from JDK 1.0.2 for interoperability /** use serialVersionUID from JDK 1.0.2 for interoperability */
@java.io.Serial
private static final long serialVersionUID = -6849794470754667710L;
If String compaction is disabled, the bytes in value
are always encoded in UTF16. For methods with several possible implementation paths, when String compaction is disabled, only one code path is taken. The instance field value is generally opaque to optimizing JIT compilers. Therefore, in performance-sensitive place, an explicit check of the static boolean COMPACT_STRINGS
is done first before checking the coder
field since the static boolean COMPACT_STRINGS
would be constant folded away by an optimizing JIT compiler. The idioms for these cases are as follows. For code such as: if (coder == LATIN1) { ... } can be written more optimally as if (coder() == LATIN1) { ... } or: if (COMPACT_STRINGS && coder == LATIN1) { ... } An optimizing JIT compiler can fold the above conditional as: COMPACT_STRINGS == true => if (coder == LATIN1) { ... } COMPACT_STRINGS == false => if (false) { ... } Implementation Note:
The actual value for this field is injected by JVM. The static
initialization block is used to set the value here to communicate
that this static final field is not statically foldable, and to
avoid any possible circular dependency during vm initialization.
/**
* If String compaction is disabled, the bytes in {@code value} are
* always encoded in UTF16.
*
* For methods with several possible implementation paths, when String
* compaction is disabled, only one code path is taken.
*
* The instance field value is generally opaque to optimizing JIT
* compilers. Therefore, in performance-sensitive place, an explicit
* check of the static boolean {@code COMPACT_STRINGS} is done first
* before checking the {@code coder} field since the static boolean
* {@code COMPACT_STRINGS} would be constant folded away by an
* optimizing JIT compiler. The idioms for these cases are as follows.
*
* For code such as:
*
* if (coder == LATIN1) { ... }
*
* can be written more optimally as
*
* if (coder() == LATIN1) { ... }
*
* or:
*
* if (COMPACT_STRINGS && coder == LATIN1) { ... }
*
* An optimizing JIT compiler can fold the above conditional as:
*
* COMPACT_STRINGS == true => if (coder == LATIN1) { ... }
* COMPACT_STRINGS == false => if (false) { ... }
*
* @implNote
* The actual value for this field is injected by JVM. The static
* initialization block is used to set the value here to communicate
* that this static final field is not statically foldable, and to
* avoid any possible circular dependency during vm initialization.
*/
static final boolean COMPACT_STRINGS;
static {
COMPACT_STRINGS = true;
}
Class String is special cased within the Serialization Stream Protocol.
A String instance is written into an ObjectOutputStream according to
Object Serialization Specification, Section 6.2, "Stream Elements"
/**
* Class String is special cased within the Serialization Stream Protocol.
*
* A String instance is written into an ObjectOutputStream according to
* <a href="{@docRoot}/../specs/serialization/protocol.html#stream-elements">
* Object Serialization Specification, Section 6.2, "Stream Elements"</a>
*/
@java.io.Serial
private static final ObjectStreamField[] serialPersistentFields =
new ObjectStreamField[0];
Initializes a newly created String
object so that it represents an empty character sequence. Note that use of this constructor is unnecessary since Strings are immutable. /**
* Initializes a newly created {@code String} object so that it represents
* an empty character sequence. Note that use of this constructor is
* unnecessary since Strings are immutable.
*/
public String() {
this.value = "".value;
this.coder = "".coder;
}
Initializes a newly created String
object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original
is needed, use of this constructor is unnecessary since Strings are immutable. Params: - original – A
String
/**
* Initializes a newly created {@code String} object so that it represents
* the same sequence of characters as the argument; in other words, the
* newly created string is a copy of the argument string. Unless an
* explicit copy of {@code original} is needed, use of this constructor is
* unnecessary since Strings are immutable.
*
* @param original
* A {@code String}
*/
@HotSpotIntrinsicCandidate
public String(String original) {
this.value = original.value;
this.coder = original.coder;
this.hash = original.hash;
}
Allocates a new String
so that it represents the sequence of characters currently contained in the character array argument. The contents of the character array are copied; subsequent modification of the character array does not affect the newly created string. Params: - value –
The initial value of the string
/**
* Allocates a new {@code String} so that it represents the sequence of
* characters currently contained in the character array argument. The
* contents of the character array are copied; subsequent modification of
* the character array does not affect the newly created string.
*
* @param value
* The initial value of the string
*/
public String(char value[]) {
this(value, 0, value.length, null);
}
Allocates a new String
that contains characters from a subarray of the character array argument. The offset
argument is the index of the first character of the subarray and the count
argument specifies the length of the subarray. The contents of the subarray are copied; subsequent modification of the character array does not affect the newly created string. Params: - value –
Array that is the source of characters
- offset –
The initial offset
- count –
The length
Throws: - IndexOutOfBoundsException – If
offset
is negative, count
is negative, or offset
is greater than value.length - count
/**
* Allocates a new {@code String} that contains characters from a subarray
* of the character array argument. The {@code offset} argument is the
* index of the first character of the subarray and the {@code count}
* argument specifies the length of the subarray. The contents of the
* subarray are copied; subsequent modification of the character array does
* not affect the newly created string.
*
* @param value
* Array that is the source of characters
*
* @param offset
* The initial offset
*
* @param count
* The length
*
* @throws IndexOutOfBoundsException
* If {@code offset} is negative, {@code count} is negative, or
* {@code offset} is greater than {@code value.length - count}
*/
public String(char value[], int offset, int count) {
this(value, offset, count, rangeCheck(value, offset, count));
}
private static Void rangeCheck(char[] value, int offset, int count) {
checkBoundsOffCount(offset, count, value.length);
return null;
}
Allocates a new String
that contains characters from a subarray of the Unicode code point array argument. The offset
argument is the index of the first code point of the subarray and the count
argument specifies the length of the subarray. The contents of the subarray are converted to char
s; subsequent modification of the int
array does not affect the newly created string. Params: - codePoints –
Array that is the source of Unicode code points
- offset –
The initial offset
- count –
The length
Throws: - IllegalArgumentException – If any invalid Unicode code point is found in
codePoints
- IndexOutOfBoundsException – If
offset
is negative, count
is negative, or offset
is greater than codePoints.length - count
Since: 1.5
/**
* Allocates a new {@code String} that contains characters from a subarray
* of the <a href="Character.html#unicode">Unicode code point</a> array
* argument. The {@code offset} argument is the index of the first code
* point of the subarray and the {@code count} argument specifies the
* length of the subarray. The contents of the subarray are converted to
* {@code char}s; subsequent modification of the {@code int} array does not
* affect the newly created string.
*
* @param codePoints
* Array that is the source of Unicode code points
*
* @param offset
* The initial offset
*
* @param count
* The length
*
* @throws IllegalArgumentException
* If any invalid Unicode code point is found in {@code
* codePoints}
*
* @throws IndexOutOfBoundsException
* If {@code offset} is negative, {@code count} is negative, or
* {@code offset} is greater than {@code codePoints.length - count}
*
* @since 1.5
*/
public String(int[] codePoints, int offset, int count) {
checkBoundsOffCount(offset, count, codePoints.length);
if (count == 0) {
this.value = "".value;
this.coder = "".coder;
return;
}
if (COMPACT_STRINGS) {
byte[] val = StringLatin1.toBytes(codePoints, offset, count);
if (val != null) {
this.coder = LATIN1;
this.value = val;
return;
}
}
this.coder = UTF16;
this.value = StringUTF16.toBytes(codePoints, offset, count);
}
Allocates a new String
constructed from a subarray of an array of 8-bit integer values. The offset
argument is the index of the first byte of the subarray, and the count
argument specifies the length of the subarray.
Each byte
in the subarray is converted to a char
as specified in the String(byte[],int)
constructor.
Params: - ascii –
The bytes to be converted to characters
- hibyte –
The top 8 bits of each 16-bit Unicode code unit
- offset –
The initial offset
- count –
The length
Throws: - IndexOutOfBoundsException – If
offset
is negative, count
is negative, or offset
is greater than ascii.length - count
See Also: Deprecated: This method does not properly convert bytes into characters. As of JDK 1.1, the preferred way to do this is via the String
constructors that take a Charset
, charset name, or that use the platform's default charset.
/**
* Allocates a new {@code String} constructed from a subarray of an array
* of 8-bit integer values.
*
* <p> The {@code offset} argument is the index of the first byte of the
* subarray, and the {@code count} argument specifies the length of the
* subarray.
*
* <p> Each {@code byte} in the subarray is converted to a {@code char} as
* specified in the {@link #String(byte[],int) String(byte[],int)} constructor.
*
* @deprecated This method does not properly convert bytes into characters.
* As of JDK 1.1, the preferred way to do this is via the
* {@code String} constructors that take a {@link
* java.nio.charset.Charset}, charset name, or that use the platform's
* default charset.
*
* @param ascii
* The bytes to be converted to characters
*
* @param hibyte
* The top 8 bits of each 16-bit Unicode code unit
*
* @param offset
* The initial offset
* @param count
* The length
*
* @throws IndexOutOfBoundsException
* If {@code offset} is negative, {@code count} is negative, or
* {@code offset} is greater than {@code ascii.length - count}
*
* @see #String(byte[], int)
* @see #String(byte[], int, int, java.lang.String)
* @see #String(byte[], int, int, java.nio.charset.Charset)
* @see #String(byte[], int, int)
* @see #String(byte[], java.lang.String)
* @see #String(byte[], java.nio.charset.Charset)
* @see #String(byte[])
*/
@Deprecated(since="1.1")
public String(byte ascii[], int hibyte, int offset, int count) {
checkBoundsOffCount(offset, count, ascii.length);
if (count == 0) {
this.value = "".value;
this.coder = "".coder;
return;
}
if (COMPACT_STRINGS && (byte)hibyte == 0) {
this.value = Arrays.copyOfRange(ascii, offset, offset + count);
this.coder = LATIN1;
} else {
hibyte <<= 8;
byte[] val = StringUTF16.newBytesFor(count);
for (int i = 0; i < count; i++) {
StringUTF16.putChar(val, i, hibyte | (ascii[offset++] & 0xff));
}
this.value = val;
this.coder = UTF16;
}
}
Allocates a new String
containing characters constructed from an array of 8-bit integer values. Each character c in the
resulting string is constructed from the corresponding component
b in the byte array such that:
c == (char)(((hibyte & 0xff) << 8)
| (b & 0xff))
Params: - ascii –
The bytes to be converted to characters
- hibyte –
The top 8 bits of each 16-bit Unicode code unit
See Also: Deprecated: This method does not properly convert bytes into characters. As of JDK 1.1, the preferred way to do this is via the String
constructors that take a Charset
, charset name, or that use the platform's default charset.
/**
* Allocates a new {@code String} containing characters constructed from
* an array of 8-bit integer values. Each character <i>c</i> in the
* resulting string is constructed from the corresponding component
* <i>b</i> in the byte array such that:
*
* <blockquote><pre>
* <b><i>c</i></b> == (char)(((hibyte & 0xff) << 8)
* | (<b><i>b</i></b> & 0xff))
* </pre></blockquote>
*
* @deprecated This method does not properly convert bytes into
* characters. As of JDK 1.1, the preferred way to do this is via the
* {@code String} constructors that take a {@link
* java.nio.charset.Charset}, charset name, or that use the platform's
* default charset.
*
* @param ascii
* The bytes to be converted to characters
*
* @param hibyte
* The top 8 bits of each 16-bit Unicode code unit
*
* @see #String(byte[], int, int, java.lang.String)
* @see #String(byte[], int, int, java.nio.charset.Charset)
* @see #String(byte[], int, int)
* @see #String(byte[], java.lang.String)
* @see #String(byte[], java.nio.charset.Charset)
* @see #String(byte[])
*/
@Deprecated(since="1.1")
public String(byte ascii[], int hibyte) {
this(ascii, hibyte, 0, ascii.length);
}
Constructs a new String
by decoding the specified subarray of bytes using the specified charset. The length of the new String
is a function of the charset, and hence may not be equal to the length of the subarray. The behavior of this constructor when the given bytes are not valid in the given charset is unspecified. The CharsetDecoder
class should be used when more control over the decoding process is required.
Params: - bytes –
The bytes to be decoded into characters
- offset –
The index of the first byte to decode
- length –
The number of bytes to decode
- charsetName – The name of a supported
charset
Throws: - UnsupportedEncodingException –
If the named charset is not supported
- IndexOutOfBoundsException – If
offset
is negative, length
is negative, or offset
is greater than bytes.length - length
Since: 1.1
/**
* Constructs a new {@code String} by decoding the specified subarray of
* bytes using the specified charset. The length of the new {@code String}
* is a function of the charset, and hence may not be equal to the length
* of the subarray.
*
* <p> The behavior of this constructor when the given bytes are not valid
* in the given charset is unspecified. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
*
* @param bytes
* The bytes to be decoded into characters
*
* @param offset
* The index of the first byte to decode
*
* @param length
* The number of bytes to decode
* @param charsetName
* The name of a supported {@linkplain java.nio.charset.Charset
* charset}
*
* @throws UnsupportedEncodingException
* If the named charset is not supported
*
* @throws IndexOutOfBoundsException
* If {@code offset} is negative, {@code length} is negative, or
* {@code offset} is greater than {@code bytes.length - length}
*
* @since 1.1
*/
public String(byte bytes[], int offset, int length, String charsetName)
throws UnsupportedEncodingException {
if (charsetName == null)
throw new NullPointerException("charsetName");
checkBoundsOffCount(offset, length, bytes.length);
StringCoding.Result ret =
StringCoding.decode(charsetName, bytes, offset, length);
this.value = ret.value;
this.coder = ret.coder;
}
Constructs a new String
by decoding the specified subarray of bytes using the specified charset. The length of the new String
is a function of the charset, and hence may not be equal to the length of the subarray. This method always replaces malformed-input and unmappable-character sequences with this charset's default replacement string. The CharsetDecoder
class should be used when more control over the decoding process is required.
Params: - bytes –
The bytes to be decoded into characters
- offset –
The index of the first byte to decode
- length –
The number of bytes to decode
- charset – The charset to be used to decode the
bytes
Throws: - IndexOutOfBoundsException – If
offset
is negative, length
is negative, or offset
is greater than bytes.length - length
Since: 1.6
/**
* Constructs a new {@code String} by decoding the specified subarray of
* bytes using the specified {@linkplain java.nio.charset.Charset charset}.
* The length of the new {@code String} is a function of the charset, and
* hence may not be equal to the length of the subarray.
*
* <p> This method always replaces malformed-input and unmappable-character
* sequences with this charset's default replacement string. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
*
* @param bytes
* The bytes to be decoded into characters
*
* @param offset
* The index of the first byte to decode
*
* @param length
* The number of bytes to decode
*
* @param charset
* The {@linkplain java.nio.charset.Charset charset} to be used to
* decode the {@code bytes}
*
* @throws IndexOutOfBoundsException
* If {@code offset} is negative, {@code length} is negative, or
* {@code offset} is greater than {@code bytes.length - length}
*
* @since 1.6
*/
public String(byte bytes[], int offset, int length, Charset charset) {
if (charset == null)
throw new NullPointerException("charset");
checkBoundsOffCount(offset, length, bytes.length);
StringCoding.Result ret =
StringCoding.decode(charset, bytes, offset, length);
this.value = ret.value;
this.coder = ret.coder;
}
Constructs a new String
by decoding the specified array of bytes using the specified charset. The length of the new String
is a function of the charset, and hence may not be equal to the length of the byte array. The behavior of this constructor when the given bytes are not valid in the given charset is unspecified. The CharsetDecoder
class should be used when more control over the decoding process is required.
Params: - bytes –
The bytes to be decoded into characters
- charsetName – The name of a supported
charset
Throws: - UnsupportedEncodingException –
If the named charset is not supported
Since: 1.1
/**
* Constructs a new {@code String} by decoding the specified array of bytes
* using the specified {@linkplain java.nio.charset.Charset charset}. The
* length of the new {@code String} is a function of the charset, and hence
* may not be equal to the length of the byte array.
*
* <p> The behavior of this constructor when the given bytes are not valid
* in the given charset is unspecified. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
*
* @param bytes
* The bytes to be decoded into characters
*
* @param charsetName
* The name of a supported {@linkplain java.nio.charset.Charset
* charset}
*
* @throws UnsupportedEncodingException
* If the named charset is not supported
*
* @since 1.1
*/
public String(byte bytes[], String charsetName)
throws UnsupportedEncodingException {
this(bytes, 0, bytes.length, charsetName);
}
Constructs a new String
by decoding the specified array of bytes using the specified charset. The length of the new String
is a function of the charset, and hence may not be equal to the length of the byte array. This method always replaces malformed-input and unmappable-character sequences with this charset's default replacement string. The CharsetDecoder
class should be used when more control over the decoding process is required.
Params: - bytes –
The bytes to be decoded into characters
- charset – The charset to be used to decode the
bytes
Since: 1.6
/**
* Constructs a new {@code String} by decoding the specified array of
* bytes using the specified {@linkplain java.nio.charset.Charset charset}.
* The length of the new {@code String} is a function of the charset, and
* hence may not be equal to the length of the byte array.
*
* <p> This method always replaces malformed-input and unmappable-character
* sequences with this charset's default replacement string. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
*
* @param bytes
* The bytes to be decoded into characters
*
* @param charset
* The {@linkplain java.nio.charset.Charset charset} to be used to
* decode the {@code bytes}
*
* @since 1.6
*/
public String(byte bytes[], Charset charset) {
this(bytes, 0, bytes.length, charset);
}
Constructs a new String
by decoding the specified subarray of bytes using the platform's default charset. The length of the new String
is a function of the charset, and hence may not be equal to the length of the subarray. The behavior of this constructor when the given bytes are not valid in the default charset is unspecified. The CharsetDecoder
class should be used when more control over the decoding process is required.
Params: - bytes –
The bytes to be decoded into characters
- offset –
The index of the first byte to decode
- length –
The number of bytes to decode
Throws: - IndexOutOfBoundsException – If
offset
is negative, length
is negative, or offset
is greater than bytes.length - length
Since: 1.1
/**
* Constructs a new {@code String} by decoding the specified subarray of
* bytes using the platform's default charset. The length of the new
* {@code String} is a function of the charset, and hence may not be equal
* to the length of the subarray.
*
* <p> The behavior of this constructor when the given bytes are not valid
* in the default charset is unspecified. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
*
* @param bytes
* The bytes to be decoded into characters
*
* @param offset
* The index of the first byte to decode
*
* @param length
* The number of bytes to decode
*
* @throws IndexOutOfBoundsException
* If {@code offset} is negative, {@code length} is negative, or
* {@code offset} is greater than {@code bytes.length - length}
*
* @since 1.1
*/
public String(byte bytes[], int offset, int length) {
checkBoundsOffCount(offset, length, bytes.length);
StringCoding.Result ret = StringCoding.decode(bytes, offset, length);
this.value = ret.value;
this.coder = ret.coder;
}
Constructs a new String
by decoding the specified array of bytes using the platform's default charset. The length of the new
String
is a function of the charset, and hence may not be equal to the length of the byte array. The behavior of this constructor when the given bytes are not valid in the default charset is unspecified. The CharsetDecoder
class should be used when more control over the decoding process is required.
Params: - bytes –
The bytes to be decoded into characters
Since: 1.1
/**
* Constructs a new {@code String} by decoding the specified array of bytes
* using the platform's default charset. The length of the new {@code
* String} is a function of the charset, and hence may not be equal to the
* length of the byte array.
*
* <p> The behavior of this constructor when the given bytes are not valid
* in the default charset is unspecified. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
*
* @param bytes
* The bytes to be decoded into characters
*
* @since 1.1
*/
public String(byte[] bytes) {
this(bytes, 0, bytes.length);
}
Allocates a new string that contains the sequence of characters
currently contained in the string buffer argument. The contents of the
string buffer are copied; subsequent modification of the string buffer
does not affect the newly created string.
Params: - buffer – A
StringBuffer
/**
* Allocates a new string that contains the sequence of characters
* currently contained in the string buffer argument. The contents of the
* string buffer are copied; subsequent modification of the string buffer
* does not affect the newly created string.
*
* @param buffer
* A {@code StringBuffer}
*/
public String(StringBuffer buffer) {
this(buffer.toString());
}
Allocates a new string that contains the sequence of characters
currently contained in the string builder argument. The contents of the
string builder are copied; subsequent modification of the string builder
does not affect the newly created string.
This constructor is provided to ease migration to
StringBuilder
. Obtaining a string from a string builder via the
toString
method is likely to run faster and is generally preferred.
Params: - builder – A
StringBuilder
Since: 1.5
/**
* Allocates a new string that contains the sequence of characters
* currently contained in the string builder argument. The contents of the
* string builder are copied; subsequent modification of the string builder
* does not affect the newly created string.
*
* <p> This constructor is provided to ease migration to {@code
* StringBuilder}. Obtaining a string from a string builder via the {@code
* toString} method is likely to run faster and is generally preferred.
*
* @param builder
* A {@code StringBuilder}
*
* @since 1.5
*/
public String(StringBuilder builder) {
this(builder, null);
}
Returns the length of this string.
The length is equal to the number of Unicode
code units in the string.
Returns: the length of the sequence of characters represented by this
object.
/**
* Returns the length of this string.
* The length is equal to the number of <a href="Character.html#unicode">Unicode
* code units</a> in the string.
*
* @return the length of the sequence of characters represented by this
* object.
*/
public int length() {
return value.length >> coder();
}
Returns: true
if length()
is 0
, otherwise false
Since: 1.6
/**
* Returns {@code true} if, and only if, {@link #length()} is {@code 0}.
*
* @return {@code true} if {@link #length()} is {@code 0}, otherwise
* {@code false}
*
* @since 1.6
*/
public boolean isEmpty() {
return value.length == 0;
}
Returns the char
value at the specified index. An index ranges from 0
to length() - 1
. The first char
value of the sequence is at index 0
, the next at index 1
, and so on, as for array indexing. If the char
value specified by the index is a surrogate, the surrogate
value is returned.
Params: - index – the index of the
char
value.
Throws: - IndexOutOfBoundsException – if the
index
argument is negative or not less than the length of this string.
Returns: the char
value at the specified index of this string. The first char
value is at index 0
.
/**
* Returns the {@code char} value at the
* specified index. An index ranges from {@code 0} to
* {@code length() - 1}. The first {@code char} value of the sequence
* is at index {@code 0}, the next at index {@code 1},
* and so on, as for array indexing.
*
* <p>If the {@code char} value specified by the index is a
* <a href="Character.html#unicode">surrogate</a>, the surrogate
* value is returned.
*
* @param index the index of the {@code char} value.
* @return the {@code char} value at the specified index of this string.
* The first {@code char} value is at index {@code 0}.
* @throws IndexOutOfBoundsException if the {@code index}
* argument is negative or not less than the length of this
* string.
*/
public char charAt(int index) {
if (isLatin1()) {
return StringLatin1.charAt(value, index);
} else {
return StringUTF16.charAt(value, index);
}
}
Returns the character (Unicode code point) at the specified index. The index refers to char
values (Unicode code units) and ranges from 0
to length()
- 1
. If the char
value specified at the given index is in the high-surrogate range, the following index is less than the length of this String
, and the char
value at the following index is in the low-surrogate range, then the supplementary code point corresponding to this surrogate pair is returned. Otherwise, the char
value at the given index is returned.
Params: - index – the index to the
char
values
Throws: - IndexOutOfBoundsException – if the
index
argument is negative or not less than the length of this string.
Returns: the code point value of the character at the index
Since: 1.5
/**
* Returns the character (Unicode code point) at the specified
* index. The index refers to {@code char} values
* (Unicode code units) and ranges from {@code 0} to
* {@link #length()}{@code - 1}.
*
* <p> If the {@code char} value specified at the given index
* is in the high-surrogate range, the following index is less
* than the length of this {@code String}, and the
* {@code char} value at the following index is in the
* low-surrogate range, then the supplementary code point
* corresponding to this surrogate pair is returned. Otherwise,
* the {@code char} value at the given index is returned.
*
* @param index the index to the {@code char} values
* @return the code point value of the character at the
* {@code index}
* @throws IndexOutOfBoundsException if the {@code index}
* argument is negative or not less than the length of this
* string.
* @since 1.5
*/
public int codePointAt(int index) {
if (isLatin1()) {
checkIndex(index, value.length);
return value[index] & 0xff;
}
int length = value.length >> 1;
checkIndex(index, length);
return StringUTF16.codePointAt(value, index, length);
}
Returns the character (Unicode code point) before the specified index. The index refers to char
values (Unicode code units) and ranges from 1
to length
. If the char
value at (index - 1)
is in the low-surrogate range, (index - 2)
is not negative, and the char
value at (index -
2)
is in the high-surrogate range, then the supplementary code point value of the surrogate pair is returned. If the char
value at index -
1
is an unpaired low-surrogate or a high-surrogate, the surrogate value is returned.
Params: - index – the index following the code point that should be returned
Throws: - IndexOutOfBoundsException – if the
index
argument is less than 1 or greater than the length of this string.
Returns: the Unicode code point value before the given index. Since: 1.5
/**
* Returns the character (Unicode code point) before the specified
* index. The index refers to {@code char} values
* (Unicode code units) and ranges from {@code 1} to {@link
* CharSequence#length() length}.
*
* <p> If the {@code char} value at {@code (index - 1)}
* is in the low-surrogate range, {@code (index - 2)} is not
* negative, and the {@code char} value at {@code (index -
* 2)} is in the high-surrogate range, then the
* supplementary code point value of the surrogate pair is
* returned. If the {@code char} value at {@code index -
* 1} is an unpaired low-surrogate or a high-surrogate, the
* surrogate value is returned.
*
* @param index the index following the code point that should be returned
* @return the Unicode code point value before the given index.
* @throws IndexOutOfBoundsException if the {@code index}
* argument is less than 1 or greater than the length
* of this string.
* @since 1.5
*/
public int codePointBefore(int index) {
int i = index - 1;
if (i < 0 || i >= length()) {
throw new StringIndexOutOfBoundsException(index);
}
if (isLatin1()) {
return (value[i] & 0xff);
}
return StringUTF16.codePointBefore(value, index);
}
Returns the number of Unicode code points in the specified text range of this String
. The text range begins at the specified beginIndex
and extends to the char
at index endIndex - 1
. Thus the length (in char
s) of the text range is endIndex-beginIndex
. Unpaired surrogates within the text range count as one code point each. Params: - beginIndex – the index to the first
char
of the text range. - endIndex – the index after the last
char
of the text range.
Throws: - IndexOutOfBoundsException – if the
beginIndex
is negative, or endIndex
is larger than the length of this String
, or beginIndex
is larger than endIndex
.
Returns: the number of Unicode code points in the specified text
range Since: 1.5
/**
* Returns the number of Unicode code points in the specified text
* range of this {@code String}. The text range begins at the
* specified {@code beginIndex} and extends to the
* {@code char} at index {@code endIndex - 1}. Thus the
* length (in {@code char}s) of the text range is
* {@code endIndex-beginIndex}. Unpaired surrogates within
* the text range count as one code point each.
*
* @param beginIndex the index to the first {@code char} of
* the text range.
* @param endIndex the index after the last {@code char} of
* the text range.
* @return the number of Unicode code points in the specified text
* range
* @throws IndexOutOfBoundsException if the
* {@code beginIndex} is negative, or {@code endIndex}
* is larger than the length of this {@code String}, or
* {@code beginIndex} is larger than {@code endIndex}.
* @since 1.5
*/
public int codePointCount(int beginIndex, int endIndex) {
if (beginIndex < 0 || beginIndex > endIndex ||
endIndex > length()) {
throw new IndexOutOfBoundsException();
}
if (isLatin1()) {
return endIndex - beginIndex;
}
return StringUTF16.codePointCount(value, beginIndex, endIndex);
}
Returns the index within this String
that is offset from the given index
by codePointOffset
code points. Unpaired surrogates within the text range given by index
and codePointOffset
count as one code point each. Params: - index – the index to be offset
- codePointOffset – the offset in code points
Throws: - IndexOutOfBoundsException – if
index
is negative or larger then the length of this String
, or if codePointOffset
is positive and the substring starting with index
has fewer than codePointOffset
code points, or if codePointOffset
is negative and the substring before index
has fewer than the absolute value of codePointOffset
code points.
Returns: the index within this String
Since: 1.5
/**
* Returns the index within this {@code String} that is
* offset from the given {@code index} by
* {@code codePointOffset} code points. Unpaired surrogates
* within the text range given by {@code index} and
* {@code codePointOffset} count as one code point each.
*
* @param index the index to be offset
* @param codePointOffset the offset in code points
* @return the index within this {@code String}
* @throws IndexOutOfBoundsException if {@code index}
* is negative or larger then the length of this
* {@code String}, or if {@code codePointOffset} is positive
* and the substring starting with {@code index} has fewer
* than {@code codePointOffset} code points,
* or if {@code codePointOffset} is negative and the substring
* before {@code index} has fewer than the absolute value
* of {@code codePointOffset} code points.
* @since 1.5
*/
public int offsetByCodePoints(int index, int codePointOffset) {
if (index < 0 || index > length()) {
throw new IndexOutOfBoundsException();
}
return Character.offsetByCodePoints(this, index, codePointOffset);
}
Copies characters from this string into the destination character
array.
The first character to be copied is at index srcBegin
; the last character to be copied is at index srcEnd-1
(thus the total number of characters to be copied is srcEnd-srcBegin
). The characters are copied into the subarray of dst
starting at index dstBegin
and ending at index:
dstBegin + (srcEnd-srcBegin) - 1
Params: - srcBegin – index of the first character in the string
to copy.
- srcEnd – index after the last character in the string
to copy.
- dst – the destination array.
- dstBegin – the start offset in the destination array.
Throws: - IndexOutOfBoundsException – If any of the following
is true:
srcBegin
is negative. srcBegin
is greater than srcEnd
srcEnd
is greater than the length of this string dstBegin
is negative dstBegin+(srcEnd-srcBegin)
is larger than dst.length
/**
* Copies characters from this string into the destination character
* array.
* <p>
* The first character to be copied is at index {@code srcBegin};
* the last character to be copied is at index {@code srcEnd-1}
* (thus the total number of characters to be copied is
* {@code srcEnd-srcBegin}). The characters are copied into the
* subarray of {@code dst} starting at index {@code dstBegin}
* and ending at index:
* <blockquote><pre>
* dstBegin + (srcEnd-srcBegin) - 1
* </pre></blockquote>
*
* @param srcBegin index of the first character in the string
* to copy.
* @param srcEnd index after the last character in the string
* to copy.
* @param dst the destination array.
* @param dstBegin the start offset in the destination array.
* @throws IndexOutOfBoundsException If any of the following
* is true:
* <ul><li>{@code srcBegin} is negative.
* <li>{@code srcBegin} is greater than {@code srcEnd}
* <li>{@code srcEnd} is greater than the length of this
* string
* <li>{@code dstBegin} is negative
* <li>{@code dstBegin+(srcEnd-srcBegin)} is larger than
* {@code dst.length}</ul>
*/
public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
checkBoundsBeginEnd(srcBegin, srcEnd, length());
checkBoundsOffCount(dstBegin, srcEnd - srcBegin, dst.length);
if (isLatin1()) {
StringLatin1.getChars(value, srcBegin, srcEnd, dst, dstBegin);
} else {
StringUTF16.getChars(value, srcBegin, srcEnd, dst, dstBegin);
}
}
Copies characters from this string into the destination byte array. Each
byte receives the 8 low-order bits of the corresponding character. The
eight high-order bits of each character are not copied and do not
participate in the transfer in any way.
The first character to be copied is at index srcBegin
; the last character to be copied is at index srcEnd-1
. The total number of characters to be copied is srcEnd-srcBegin
. The characters, converted to bytes, are copied into the subarray of
dst
starting at index dstBegin
and ending at index:
dstBegin + (srcEnd-srcBegin) - 1
Params: - srcBegin –
Index of the first character in the string to copy
- srcEnd –
Index after the last character in the string to copy
- dst –
The destination array
- dstBegin –
The start offset in the destination array
Throws: - IndexOutOfBoundsException –
If any of the following is true:
-
srcBegin
is negative -
srcBegin
is greater than srcEnd
-
srcEnd
is greater than the length of this String -
dstBegin
is negative -
dstBegin+(srcEnd-srcBegin)
is larger than
dst.length
Deprecated: This method does not properly convert characters into bytes. As of JDK 1.1, the preferred way to do this is via the getBytes()
method, which uses the platform's default charset.
/**
* Copies characters from this string into the destination byte array. Each
* byte receives the 8 low-order bits of the corresponding character. The
* eight high-order bits of each character are not copied and do not
* participate in the transfer in any way.
*
* <p> The first character to be copied is at index {@code srcBegin}; the
* last character to be copied is at index {@code srcEnd-1}. The total
* number of characters to be copied is {@code srcEnd-srcBegin}. The
* characters, converted to bytes, are copied into the subarray of {@code
* dst} starting at index {@code dstBegin} and ending at index:
*
* <blockquote><pre>
* dstBegin + (srcEnd-srcBegin) - 1
* </pre></blockquote>
*
* @deprecated This method does not properly convert characters into
* bytes. As of JDK 1.1, the preferred way to do this is via the
* {@link #getBytes()} method, which uses the platform's default charset.
*
* @param srcBegin
* Index of the first character in the string to copy
*
* @param srcEnd
* Index after the last character in the string to copy
*
* @param dst
* The destination array
*
* @param dstBegin
* The start offset in the destination array
*
* @throws IndexOutOfBoundsException
* If any of the following is true:
* <ul>
* <li> {@code srcBegin} is negative
* <li> {@code srcBegin} is greater than {@code srcEnd}
* <li> {@code srcEnd} is greater than the length of this String
* <li> {@code dstBegin} is negative
* <li> {@code dstBegin+(srcEnd-srcBegin)} is larger than {@code
* dst.length}
* </ul>
*/
@Deprecated(since="1.1")
public void getBytes(int srcBegin, int srcEnd, byte dst[], int dstBegin) {
checkBoundsBeginEnd(srcBegin, srcEnd, length());
Objects.requireNonNull(dst);
checkBoundsOffCount(dstBegin, srcEnd - srcBegin, dst.length);
if (isLatin1()) {
StringLatin1.getBytes(value, srcBegin, srcEnd, dst, dstBegin);
} else {
StringUTF16.getBytes(value, srcBegin, srcEnd, dst, dstBegin);
}
}
Encodes this String
into a sequence of bytes using the named charset, storing the result into a new byte array. The behavior of this method when this string cannot be encoded in the given charset is unspecified. The CharsetEncoder
class should be used when more control over the encoding process is required.
Params: - charsetName – The name of a supported
charset
Throws: - UnsupportedEncodingException –
If the named charset is not supported
Returns: The resultant byte array Since: 1.1
/**
* Encodes this {@code String} into a sequence of bytes using the named
* charset, storing the result into a new byte array.
*
* <p> The behavior of this method when this string cannot be encoded in
* the given charset is unspecified. The {@link
* java.nio.charset.CharsetEncoder} class should be used when more control
* over the encoding process is required.
*
* @param charsetName
* The name of a supported {@linkplain java.nio.charset.Charset
* charset}
*
* @return The resultant byte array
*
* @throws UnsupportedEncodingException
* If the named charset is not supported
*
* @since 1.1
*/
public byte[] getBytes(String charsetName)
throws UnsupportedEncodingException {
if (charsetName == null) throw new NullPointerException();
return StringCoding.encode(charsetName, coder(), value);
}
Encodes this String
into a sequence of bytes using the given charset, storing the result into a new byte array. This method always replaces malformed-input and unmappable-character sequences with this charset's default replacement byte array. The CharsetEncoder
class should be used when more control over the encoding process is required.
Params: - charset – The Charset to be used to encode the
String
Returns: The resultant byte array Since: 1.6
/**
* Encodes this {@code String} into a sequence of bytes using the given
* {@linkplain java.nio.charset.Charset charset}, storing the result into a
* new byte array.
*
* <p> This method always replaces malformed-input and unmappable-character
* sequences with this charset's default replacement byte array. The
* {@link java.nio.charset.CharsetEncoder} class should be used when more
* control over the encoding process is required.
*
* @param charset
* The {@linkplain java.nio.charset.Charset} to be used to encode
* the {@code String}
*
* @return The resultant byte array
*
* @since 1.6
*/
public byte[] getBytes(Charset charset) {
if (charset == null) throw new NullPointerException();
return StringCoding.encode(charset, coder(), value);
}
Encodes this String
into a sequence of bytes using the platform's default charset, storing the result into a new byte array. The behavior of this method when this string cannot be encoded in the default charset is unspecified. The CharsetEncoder
class should be used when more control over the encoding process is required.
Returns: The resultant byte array Since: 1.1
/**
* Encodes this {@code String} into a sequence of bytes using the
* platform's default charset, storing the result into a new byte array.
*
* <p> The behavior of this method when this string cannot be encoded in
* the default charset is unspecified. The {@link
* java.nio.charset.CharsetEncoder} class should be used when more control
* over the encoding process is required.
*
* @return The resultant byte array
*
* @since 1.1
*/
public byte[] getBytes() {
return StringCoding.encode(coder(), value);
}
Compares this string to the specified object. The result is
true
if and only if the argument is not null
and is a
String
object that represents the same sequence of characters as this object. For finer-grained String comparison, refer to Collator
.
Params: - anObject – The object to compare this
String
against
See Also: Returns: true
if the given object represents a String
equivalent to this string, false
otherwise
/**
* Compares this string to the specified object. The result is {@code
* true} if and only if the argument is not {@code null} and is a {@code
* String} object that represents the same sequence of characters as this
* object.
*
* <p>For finer-grained String comparison, refer to
* {@link java.text.Collator}.
*
* @param anObject
* The object to compare this {@code String} against
*
* @return {@code true} if the given object represents a {@code String}
* equivalent to this string, {@code false} otherwise
*
* @see #compareTo(String)
* @see #equalsIgnoreCase(String)
*/
public boolean equals(Object anObject) {
if (this == anObject) {
return true;
}
if (anObject instanceof String) {
String aString = (String)anObject;
if (!COMPACT_STRINGS || this.coder == aString.coder) {
return StringLatin1.equals(value, aString.value);
}
}
return false;
}
Compares this string to the specified StringBuffer
. The result is true
if and only if this String
represents the same sequence of characters as the specified StringBuffer
. This method synchronizes on the StringBuffer
. For finer-grained String comparison, refer to Collator
.
Params: - sb – The
StringBuffer
to compare this String
against
Returns: true
if this String
represents the same sequence of characters as the specified StringBuffer
, false
otherwise Since: 1.4
/**
* Compares this string to the specified {@code StringBuffer}. The result
* is {@code true} if and only if this {@code String} represents the same
* sequence of characters as the specified {@code StringBuffer}. This method
* synchronizes on the {@code StringBuffer}.
*
* <p>For finer-grained String comparison, refer to
* {@link java.text.Collator}.
*
* @param sb
* The {@code StringBuffer} to compare this {@code String} against
*
* @return {@code true} if this {@code String} represents the same
* sequence of characters as the specified {@code StringBuffer},
* {@code false} otherwise
*
* @since 1.4
*/
public boolean contentEquals(StringBuffer sb) {
return contentEquals((CharSequence)sb);
}
private boolean nonSyncContentEquals(AbstractStringBuilder sb) {
int len = length();
if (len != sb.length()) {
return false;
}
byte v1[] = value;
byte v2[] = sb.getValue();
byte coder = coder();
if (coder == sb.getCoder()) {
int n = v1.length;
for (int i = 0; i < n; i++) {
if (v1[i] != v2[i]) {
return false;
}
}
} else {
if (coder != LATIN1) { // utf16 str and latin1 abs can never be "equal"
return false;
}
return StringUTF16.contentEquals(v1, v2, len);
}
return true;
}
Compares this string to the specified CharSequence
. The result is true
if and only if this String
represents the same sequence of char values as the specified sequence. Note that if the CharSequence
is a StringBuffer
then the method synchronizes on it. For finer-grained String comparison, refer to Collator
.
Params: - cs – The sequence to compare this
String
against
Returns: true
if this String
represents the same sequence of char values as the specified sequence,
false
otherwise Since: 1.5
/**
* Compares this string to the specified {@code CharSequence}. The
* result is {@code true} if and only if this {@code String} represents the
* same sequence of char values as the specified sequence. Note that if the
* {@code CharSequence} is a {@code StringBuffer} then the method
* synchronizes on it.
*
* <p>For finer-grained String comparison, refer to
* {@link java.text.Collator}.
*
* @param cs
* The sequence to compare this {@code String} against
*
* @return {@code true} if this {@code String} represents the same
* sequence of char values as the specified sequence, {@code
* false} otherwise
*
* @since 1.5
*/
public boolean contentEquals(CharSequence cs) {
// Argument is a StringBuffer, StringBuilder
if (cs instanceof AbstractStringBuilder) {
if (cs instanceof StringBuffer) {
synchronized(cs) {
return nonSyncContentEquals((AbstractStringBuilder)cs);
}
} else {
return nonSyncContentEquals((AbstractStringBuilder)cs);
}
}
// Argument is a String
if (cs instanceof String) {
return equals(cs);
}
// Argument is a generic CharSequence
int n = cs.length();
if (n != length()) {
return false;
}
byte[] val = this.value;
if (isLatin1()) {
for (int i = 0; i < n; i++) {
if ((val[i] & 0xff) != cs.charAt(i)) {
return false;
}
}
} else {
if (!StringUTF16.contentEquals(val, cs, n)) {
return false;
}
}
return true;
}
Compares this String
to another String
, ignoring case considerations. Two strings are considered equal ignoring case if they are of the same length and corresponding characters in the two strings are equal ignoring case. Two characters c1
and c2
are considered the same ignoring case if at least one of the following is true:
- The two characters are the same (as compared by the
==
operator) - Calling
Character.toLowerCase(Character.toUpperCase(char))
on each character produces the same result
Note that this method does not take locale into account, and will result in unsatisfactory results for certain locales. The Collator
class provides locale-sensitive comparison.
Params: - anotherString – The
String
to compare this String
against
See Also: Returns: true
if the argument is not null
and it represents an equivalent String
ignoring case;
false
otherwise
/**
* Compares this {@code String} to another {@code String}, ignoring case
* considerations. Two strings are considered equal ignoring case if they
* are of the same length and corresponding characters in the two strings
* are equal ignoring case.
*
* <p> Two characters {@code c1} and {@code c2} are considered the same
* ignoring case if at least one of the following is true:
* <ul>
* <li> The two characters are the same (as compared by the
* {@code ==} operator)
* <li> Calling {@code Character.toLowerCase(Character.toUpperCase(char))}
* on each character produces the same result
* </ul>
*
* <p>Note that this method does <em>not</em> take locale into account, and
* will result in unsatisfactory results for certain locales. The
* {@link java.text.Collator} class provides locale-sensitive comparison.
*
* @param anotherString
* The {@code String} to compare this {@code String} against
*
* @return {@code true} if the argument is not {@code null} and it
* represents an equivalent {@code String} ignoring case; {@code
* false} otherwise
*
* @see #equals(Object)
*/
public boolean equalsIgnoreCase(String anotherString) {
return (this == anotherString) ? true
: (anotherString != null)
&& (anotherString.length() == length())
&& regionMatches(true, 0, anotherString, 0, length());
}
Compares two strings lexicographically. The comparison is based on the Unicode value of each character in the strings. The character sequence represented by this String
object is compared lexicographically to the character sequence represented by the argument string. The result is a negative integer if this String
object lexicographically precedes the argument string. The result is a positive integer if this String
object lexicographically follows the argument string. The result is zero if the strings are equal; compareTo
returns 0
exactly when the equals(Object)
method would return true
.
This is the definition of lexicographic ordering. If two strings are
different, then either they have different characters at some index
that is a valid index for both strings, or their lengths are different,
or both. If they have different characters at one or more index
positions, let k be the smallest such index; then the string
whose character at position k has the smaller value, as determined by using the <
operator, lexicographically precedes the other string. In this case, compareTo
returns the difference of the two character values at position k
in the two string -- that is, the value:
this.charAt(k)-anotherString.charAt(k)
If there is no index position at which they differ, then the shorter string lexicographically precedes the longer string. In this case, compareTo
returns the difference of the lengths of the strings -- that is, the value: this.length()-anotherString.length()
For finer-grained String comparison, refer to Collator
.
Params: - anotherString – the
String
to be compared.
Returns: the value 0
if the argument string is equal to this string; a value less than 0
if this string is lexicographically less than the string argument; and a value greater than 0
if this string is lexicographically greater than the string argument.
/**
* Compares two strings lexicographically.
* The comparison is based on the Unicode value of each character in
* the strings. The character sequence represented by this
* {@code String} object is compared lexicographically to the
* character sequence represented by the argument string. The result is
* a negative integer if this {@code String} object
* lexicographically precedes the argument string. The result is a
* positive integer if this {@code String} object lexicographically
* follows the argument string. The result is zero if the strings
* are equal; {@code compareTo} returns {@code 0} exactly when
* the {@link #equals(Object)} method would return {@code true}.
* <p>
* This is the definition of lexicographic ordering. If two strings are
* different, then either they have different characters at some index
* that is a valid index for both strings, or their lengths are different,
* or both. If they have different characters at one or more index
* positions, let <i>k</i> be the smallest such index; then the string
* whose character at position <i>k</i> has the smaller value, as
* determined by using the {@code <} operator, lexicographically precedes the
* other string. In this case, {@code compareTo} returns the
* difference of the two character values at position {@code k} in
* the two string -- that is, the value:
* <blockquote><pre>
* this.charAt(k)-anotherString.charAt(k)
* </pre></blockquote>
* If there is no index position at which they differ, then the shorter
* string lexicographically precedes the longer string. In this case,
* {@code compareTo} returns the difference of the lengths of the
* strings -- that is, the value:
* <blockquote><pre>
* this.length()-anotherString.length()
* </pre></blockquote>
*
* <p>For finer-grained String comparison, refer to
* {@link java.text.Collator}.
*
* @param anotherString the {@code String} to be compared.
* @return the value {@code 0} if the argument string is equal to
* this string; a value less than {@code 0} if this string
* is lexicographically less than the string argument; and a
* value greater than {@code 0} if this string is
* lexicographically greater than the string argument.
*/
public int compareTo(String anotherString) {
byte v1[] = value;
byte v2[] = anotherString.value;
byte coder = coder();
if (coder == anotherString.coder()) {
return coder == LATIN1 ? StringLatin1.compareTo(v1, v2)
: StringUTF16.compareTo(v1, v2);
}
return coder == LATIN1 ? StringLatin1.compareToUTF16(v1, v2)
: StringUTF16.compareToLatin1(v1, v2);
}
A Comparator that orders String
objects as by compareToIgnoreCase
. This comparator is serializable.
Note that this Comparator does not take locale into account, and will result in an unsatisfactory ordering for certain locales. The Collator
class provides locale-sensitive comparison.
See Also: Since: 1.2
/**
* A Comparator that orders {@code String} objects as by
* {@code compareToIgnoreCase}. This comparator is serializable.
* <p>
* Note that this Comparator does <em>not</em> take locale into account,
* and will result in an unsatisfactory ordering for certain locales.
* The {@link java.text.Collator} class provides locale-sensitive comparison.
*
* @see java.text.Collator
* @since 1.2
*/
public static final Comparator<String> CASE_INSENSITIVE_ORDER
= new CaseInsensitiveComparator();
private static class CaseInsensitiveComparator
implements Comparator<String>, java.io.Serializable {
// use serialVersionUID from JDK 1.2.2 for interoperability
@java.io.Serial
private static final long serialVersionUID = 8575799808933029326L;
public int compare(String s1, String s2) {
byte v1[] = s1.value;
byte v2[] = s2.value;
byte coder = s1.coder();
if (coder == s2.coder()) {
return coder == LATIN1 ? StringLatin1.compareToCI(v1, v2)
: StringUTF16.compareToCI(v1, v2);
}
return coder == LATIN1 ? StringLatin1.compareToCI_UTF16(v1, v2)
: StringUTF16.compareToCI_Latin1(v1, v2);
}
Replaces the de-serialized object. /** Replaces the de-serialized object. */
@java.io.Serial
private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
}
Compares two strings lexicographically, ignoring case differences. This method returns an integer whose sign is that of calling compareTo
with normalized versions of the strings where case differences have been eliminated by calling Character.toLowerCase(Character.toUpperCase(character))
on each character.
Note that this method does not take locale into account, and will result in an unsatisfactory ordering for certain locales. The Collator
class provides locale-sensitive comparison.
Params: - str – the
String
to be compared.
See Also: Returns: a negative integer, zero, or a positive integer as the
specified String is greater than, equal to, or less
than this String, ignoring case considerations. Since: 1.2
/**
* Compares two strings lexicographically, ignoring case
* differences. This method returns an integer whose sign is that of
* calling {@code compareTo} with normalized versions of the strings
* where case differences have been eliminated by calling
* {@code Character.toLowerCase(Character.toUpperCase(character))} on
* each character.
* <p>
* Note that this method does <em>not</em> take locale into account,
* and will result in an unsatisfactory ordering for certain locales.
* The {@link java.text.Collator} class provides locale-sensitive comparison.
*
* @param str the {@code String} to be compared.
* @return a negative integer, zero, or a positive integer as the
* specified String is greater than, equal to, or less
* than this String, ignoring case considerations.
* @see java.text.Collator
* @since 1.2
*/
public int compareToIgnoreCase(String str) {
return CASE_INSENSITIVE_ORDER.compare(this, str);
}
Tests if two string regions are equal.
A substring of this String
object is compared to a substring of the argument other. The result is true if these substrings represent identical character sequences. The substring of this String
object to be compared begins at index toffset
and has length len
. The substring of other to be compared begins at index ooffset
and has length len
. The result is false
if and only if at least one of the following is true:
toffset
is negative. ooffset
is negative. toffset+len
is greater than the length of this String
object. ooffset+len
is greater than the length of the other argument. - There is some nonnegative integer k less than
len
such that: this.charAt(toffset +
k) != other.charAt(ooffset +
k)
Note that this method does not take locale into account. The Collator
class provides locale-sensitive comparison.
Params: - toffset – the starting offset of the subregion in this string.
- other – the string argument.
- ooffset – the starting offset of the subregion in the string
argument.
- len – the number of characters to compare.
Returns: true
if the specified subregion of this string exactly matches the specified subregion of the string argument; false
otherwise.
/**
* Tests if two string regions are equal.
* <p>
* A substring of this {@code String} object is compared to a substring
* of the argument other. The result is true if these substrings
* represent identical character sequences. The substring of this
* {@code String} object to be compared begins at index {@code toffset}
* and has length {@code len}. The substring of other to be compared
* begins at index {@code ooffset} and has length {@code len}. The
* result is {@code false} if and only if at least one of the following
* is true:
* <ul><li>{@code toffset} is negative.
* <li>{@code ooffset} is negative.
* <li>{@code toffset+len} is greater than the length of this
* {@code String} object.
* <li>{@code ooffset+len} is greater than the length of the other
* argument.
* <li>There is some nonnegative integer <i>k</i> less than {@code len}
* such that:
* {@code this.charAt(toffset + }<i>k</i>{@code ) != other.charAt(ooffset + }
* <i>k</i>{@code )}
* </ul>
*
* <p>Note that this method does <em>not</em> take locale into account. The
* {@link java.text.Collator} class provides locale-sensitive comparison.
*
* @param toffset the starting offset of the subregion in this string.
* @param other the string argument.
* @param ooffset the starting offset of the subregion in the string
* argument.
* @param len the number of characters to compare.
* @return {@code true} if the specified subregion of this string
* exactly matches the specified subregion of the string argument;
* {@code false} otherwise.
*/
public boolean regionMatches(int toffset, String other, int ooffset, int len) {
byte tv[] = value;
byte ov[] = other.value;
// Note: toffset, ooffset, or len might be near -1>>>1.
if ((ooffset < 0) || (toffset < 0) ||
(toffset > (long)length() - len) ||
(ooffset > (long)other.length() - len)) {
return false;
}
byte coder = coder();
if (coder == other.coder()) {
if (!isLatin1() && (len > 0)) {
toffset = toffset << 1;
ooffset = ooffset << 1;
len = len << 1;
}
while (len-- > 0) {
if (tv[toffset++] != ov[ooffset++]) {
return false;
}
}
} else {
if (coder == LATIN1) {
while (len-- > 0) {
if (StringLatin1.getChar(tv, toffset++) !=
StringUTF16.getChar(ov, ooffset++)) {
return false;
}
}
} else {
while (len-- > 0) {
if (StringUTF16.getChar(tv, toffset++) !=
StringLatin1.getChar(ov, ooffset++)) {
return false;
}
}
}
}
return true;
}
Tests if two string regions are equal.
A substring of this String
object is compared to a substring of the argument other
. The result is true
if these substrings represent character sequences that are the same, ignoring case if and only if ignoreCase
is true. The substring of this String
object to be compared begins at index toffset
and has length len
. The substring of other
to be compared begins at index ooffset
and has length len
. The result is false
if and only if at least one of the following is true:
toffset
is negative. ooffset
is negative. toffset+len
is greater than the length of this String
object. ooffset+len
is greater than the length of the other argument. ignoreCase
is false
and there is some nonnegative integer k less than len
such that: this.charAt(toffset+k) != other.charAt(ooffset+k)
ignoreCase
is true
and there is some nonnegative integer k less than len
such that: Character.toLowerCase(Character.toUpperCase(this.charAt(toffset+k))) !=
Character.toLowerCase(Character.toUpperCase(other.charAt(ooffset+k)))
Note that this method does not take locale into account, and will result in unsatisfactory results for certain locales when ignoreCase
is true
. The Collator
class provides locale-sensitive comparison.
Params: - ignoreCase – if
true
, ignore case when comparing characters. - toffset – the starting offset of the subregion in this
string.
- other – the string argument.
- ooffset – the starting offset of the subregion in the string
argument.
- len – the number of characters to compare.
Returns: true
if the specified subregion of this string matches the specified subregion of the string argument; false
otherwise. Whether the matching is exact or case insensitive depends on the ignoreCase
argument.
/**
* Tests if two string regions are equal.
* <p>
* A substring of this {@code String} object is compared to a substring
* of the argument {@code other}. The result is {@code true} if these
* substrings represent character sequences that are the same, ignoring
* case if and only if {@code ignoreCase} is true. The substring of
* this {@code String} object to be compared begins at index
* {@code toffset} and has length {@code len}. The substring of
* {@code other} to be compared begins at index {@code ooffset} and
* has length {@code len}. The result is {@code false} if and only if
* at least one of the following is true:
* <ul><li>{@code toffset} is negative.
* <li>{@code ooffset} is negative.
* <li>{@code toffset+len} is greater than the length of this
* {@code String} object.
* <li>{@code ooffset+len} is greater than the length of the other
* argument.
* <li>{@code ignoreCase} is {@code false} and there is some nonnegative
* integer <i>k</i> less than {@code len} such that:
* <blockquote><pre>
* this.charAt(toffset+k) != other.charAt(ooffset+k)
* </pre></blockquote>
* <li>{@code ignoreCase} is {@code true} and there is some nonnegative
* integer <i>k</i> less than {@code len} such that:
* <blockquote><pre>
* Character.toLowerCase(Character.toUpperCase(this.charAt(toffset+k))) !=
Character.toLowerCase(Character.toUpperCase(other.charAt(ooffset+k)))
* </pre></blockquote>
* </ul>
*
* <p>Note that this method does <em>not</em> take locale into account,
* and will result in unsatisfactory results for certain locales when
* {@code ignoreCase} is {@code true}. The {@link java.text.Collator} class
* provides locale-sensitive comparison.
*
* @param ignoreCase if {@code true}, ignore case when comparing
* characters.
* @param toffset the starting offset of the subregion in this
* string.
* @param other the string argument.
* @param ooffset the starting offset of the subregion in the string
* argument.
* @param len the number of characters to compare.
* @return {@code true} if the specified subregion of this string
* matches the specified subregion of the string argument;
* {@code false} otherwise. Whether the matching is exact
* or case insensitive depends on the {@code ignoreCase}
* argument.
*/
public boolean regionMatches(boolean ignoreCase, int toffset,
String other, int ooffset, int len) {
if (!ignoreCase) {
return regionMatches(toffset, other, ooffset, len);
}
// Note: toffset, ooffset, or len might be near -1>>>1.
if ((ooffset < 0) || (toffset < 0)
|| (toffset > (long)length() - len)
|| (ooffset > (long)other.length() - len)) {
return false;
}
byte tv[] = value;
byte ov[] = other.value;
byte coder = coder();
if (coder == other.coder()) {
return coder == LATIN1
? StringLatin1.regionMatchesCI(tv, toffset, ov, ooffset, len)
: StringUTF16.regionMatchesCI(tv, toffset, ov, ooffset, len);
}
return coder == LATIN1
? StringLatin1.regionMatchesCI_UTF16(tv, toffset, ov, ooffset, len)
: StringUTF16.regionMatchesCI_Latin1(tv, toffset, ov, ooffset, len);
}
Tests if the substring of this string beginning at the
specified index starts with the specified prefix.
Params: - prefix – the prefix.
- toffset – where to begin looking in this string.
Returns: true
if the character sequence represented by the argument is a prefix of the substring of this object starting at index toffset
; false
otherwise. The result is false
if toffset
is negative or greater than the length of this String
object; otherwise the result is the same as the result of the expression this.substring(toffset).startsWith(prefix)
/**
* Tests if the substring of this string beginning at the
* specified index starts with the specified prefix.
*
* @param prefix the prefix.
* @param toffset where to begin looking in this string.
* @return {@code true} if the character sequence represented by the
* argument is a prefix of the substring of this object starting
* at index {@code toffset}; {@code false} otherwise.
* The result is {@code false} if {@code toffset} is
* negative or greater than the length of this
* {@code String} object; otherwise the result is the same
* as the result of the expression
* <pre>
* this.substring(toffset).startsWith(prefix)
* </pre>
*/
public boolean startsWith(String prefix, int toffset) {
// Note: toffset might be near -1>>>1.
if (toffset < 0 || toffset > length() - prefix.length()) {
return false;
}
byte ta[] = value;
byte pa[] = prefix.value;
int po = 0;
int pc = pa.length;
byte coder = coder();
if (coder == prefix.coder()) {
int to = (coder == LATIN1) ? toffset : toffset << 1;
while (po < pc) {
if (ta[to++] != pa[po++]) {
return false;
}
}
} else {
if (coder == LATIN1) { // && pcoder == UTF16
return false;
}
// coder == UTF16 && pcoder == LATIN1)
while (po < pc) {
if (StringUTF16.getChar(ta, toffset++) != (pa[po++] & 0xff)) {
return false;
}
}
}
return true;
}
Tests if this string starts with the specified prefix.
Params: - prefix – the prefix.
Returns: true
if the character sequence represented by the argument is a prefix of the character sequence represented by this string; false
otherwise. Note also that true
will be returned if the argument is an empty string or is equal to this String
object as determined by the equals(Object)
method. Since: 1.0
/**
* Tests if this string starts with the specified prefix.
*
* @param prefix the prefix.
* @return {@code true} if the character sequence represented by the
* argument is a prefix of the character sequence represented by
* this string; {@code false} otherwise.
* Note also that {@code true} will be returned if the
* argument is an empty string or is equal to this
* {@code String} object as determined by the
* {@link #equals(Object)} method.
* @since 1.0
*/
public boolean startsWith(String prefix) {
return startsWith(prefix, 0);
}
Tests if this string ends with the specified suffix.
Params: - suffix – the suffix.
Returns: true
if the character sequence represented by the argument is a suffix of the character sequence represented by this object; false
otherwise. Note that the result will be true
if the argument is the empty string or is equal to this String
object as determined by the equals(Object)
method.
/**
* Tests if this string ends with the specified suffix.
*
* @param suffix the suffix.
* @return {@code true} if the character sequence represented by the
* argument is a suffix of the character sequence represented by
* this object; {@code false} otherwise. Note that the
* result will be {@code true} if the argument is the
* empty string or is equal to this {@code String} object
* as determined by the {@link #equals(Object)} method.
*/
public boolean endsWith(String suffix) {
return startsWith(suffix, length() - suffix.length());
}
Returns a hash code for this string. The hash code for a String
object is computed as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int
arithmetic, where s[i]
is the ith character of the string, n
is the length of the string, and ^
indicates exponentiation. (The hash value of the empty string is zero.) Returns: a hash code value for this object.
/**
* Returns a hash code for this string. The hash code for a
* {@code String} object is computed as
* <blockquote><pre>
* s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
* </pre></blockquote>
* using {@code int} arithmetic, where {@code s[i]} is the
* <i>i</i>th character of the string, {@code n} is the length of
* the string, and {@code ^} indicates exponentiation.
* (The hash value of the empty string is zero.)
*
* @return a hash code value for this object.
*/
public int hashCode() {
// The hash or hashIsZero fields are subject to a benign data race,
// making it crucial to ensure that any observable result of the
// calculation in this method stays correct under any possible read of
// these fields. Necessary restrictions to allow this to be correct
// without explicit memory fences or similar concurrency primitives is
// that we can ever only write to one of these two fields for a given
// String instance, and that the computation is idempotent and derived
// from immutable state
int h = hash;
if (h == 0 && !hashIsZero) {
h = isLatin1() ? StringLatin1.hashCode(value)
: StringUTF16.hashCode(value);
if (h == 0) {
hashIsZero = true;
} else {
hash = h;
}
}
return h;
}
Returns the index within this string of the first occurrence of the specified character. If a character with value ch
occurs in the character sequence represented by this String
object, then the index (in Unicode code units) of the first such occurrence is returned. For values of ch
in the range from 0 to 0xFFFF (inclusive), this is the smallest value k such that:
this.charAt(k) == ch
is true. For other values of ch
, it is the smallest value k such that:
this.codePointAt(k) == ch
is true. In either case, if no such character occurs in this string, then -1
is returned. Params: - ch – a character (Unicode code point).
Returns: the index of the first occurrence of the character in the character sequence represented by this object, or -1
if the character does not occur.
/**
* Returns the index within this string of the first occurrence of
* the specified character. If a character with value
* {@code ch} occurs in the character sequence represented by
* this {@code String} object, then the index (in Unicode
* code units) of the first such occurrence is returned. For
* values of {@code ch} in the range from 0 to 0xFFFF
* (inclusive), this is the smallest value <i>k</i> such that:
* <blockquote><pre>
* this.charAt(<i>k</i>) == ch
* </pre></blockquote>
* is true. For other values of {@code ch}, it is the
* smallest value <i>k</i> such that:
* <blockquote><pre>
* this.codePointAt(<i>k</i>) == ch
* </pre></blockquote>
* is true. In either case, if no such character occurs in this
* string, then {@code -1} is returned.
*
* @param ch a character (Unicode code point).
* @return the index of the first occurrence of the character in the
* character sequence represented by this object, or
* {@code -1} if the character does not occur.
*/
public int indexOf(int ch) {
return indexOf(ch, 0);
}
Returns the index within this string of the first occurrence of the
specified character, starting the search at the specified index.
If a character with value ch
occurs in the character sequence represented by this String
object at an index no smaller than fromIndex
, then the index of the first such occurrence is returned. For values of ch
in the range from 0 to 0xFFFF (inclusive), this is the smallest value k such that:
(this.charAt(k) == ch) &&
(k >= fromIndex)
is true. For other values of ch
, it is the smallest value k such that:
(this.codePointAt(k) == ch) &&
(k >= fromIndex)
is true. In either case, if no such character occurs in this string at or after position fromIndex
, then -1
is returned. There is no restriction on the value of fromIndex
. If it is negative, it has the same effect as if it were zero: this entire string may be searched. If it is greater than the length of this string, it has the same effect as if it were equal to the length of this string: -1
is returned.
All indices are specified in char
values (Unicode code units).
Params: - ch – a character (Unicode code point).
- fromIndex – the index to start the search from.
Returns: the index of the first occurrence of the character in the character sequence represented by this object that is greater than or equal to fromIndex
, or -1
if the character does not occur.
/**
* Returns the index within this string of the first occurrence of the
* specified character, starting the search at the specified index.
* <p>
* If a character with value {@code ch} occurs in the
* character sequence represented by this {@code String}
* object at an index no smaller than {@code fromIndex}, then
* the index of the first such occurrence is returned. For values
* of {@code ch} in the range from 0 to 0xFFFF (inclusive),
* this is the smallest value <i>k</i> such that:
* <blockquote><pre>
* (this.charAt(<i>k</i>) == ch) {@code &&} (<i>k</i> >= fromIndex)
* </pre></blockquote>
* is true. For other values of {@code ch}, it is the
* smallest value <i>k</i> such that:
* <blockquote><pre>
* (this.codePointAt(<i>k</i>) == ch) {@code &&} (<i>k</i> >= fromIndex)
* </pre></blockquote>
* is true. In either case, if no such character occurs in this
* string at or after position {@code fromIndex}, then
* {@code -1} is returned.
*
* <p>
* There is no restriction on the value of {@code fromIndex}. If it
* is negative, it has the same effect as if it were zero: this entire
* string may be searched. If it is greater than the length of this
* string, it has the same effect as if it were equal to the length of
* this string: {@code -1} is returned.
*
* <p>All indices are specified in {@code char} values
* (Unicode code units).
*
* @param ch a character (Unicode code point).
* @param fromIndex the index to start the search from.
* @return the index of the first occurrence of the character in the
* character sequence represented by this object that is greater
* than or equal to {@code fromIndex}, or {@code -1}
* if the character does not occur.
*/
public int indexOf(int ch, int fromIndex) {
return isLatin1() ? StringLatin1.indexOf(value, ch, fromIndex)
: StringUTF16.indexOf(value, ch, fromIndex);
}
Returns the index within this string of the last occurrence of the specified character. For values of ch
in the range from 0 to 0xFFFF (inclusive), the index (in Unicode code units) returned is the largest value k such that:
this.charAt(k) == ch
is true. For other values of ch
, it is the largest value k such that:
this.codePointAt(k) == ch
is true. In either case, if no such character occurs in this string, then -1
is returned. The String
is searched backwards starting at the last character. Params: - ch – a character (Unicode code point).
Returns: the index of the last occurrence of the character in the character sequence represented by this object, or -1
if the character does not occur.
/**
* Returns the index within this string of the last occurrence of
* the specified character. For values of {@code ch} in the
* range from 0 to 0xFFFF (inclusive), the index (in Unicode code
* units) returned is the largest value <i>k</i> such that:
* <blockquote><pre>
* this.charAt(<i>k</i>) == ch
* </pre></blockquote>
* is true. For other values of {@code ch}, it is the
* largest value <i>k</i> such that:
* <blockquote><pre>
* this.codePointAt(<i>k</i>) == ch
* </pre></blockquote>
* is true. In either case, if no such character occurs in this
* string, then {@code -1} is returned. The
* {@code String} is searched backwards starting at the last
* character.
*
* @param ch a character (Unicode code point).
* @return the index of the last occurrence of the character in the
* character sequence represented by this object, or
* {@code -1} if the character does not occur.
*/
public int lastIndexOf(int ch) {
return lastIndexOf(ch, length() - 1);
}
Returns the index within this string of the last occurrence of the specified character, searching backward starting at the specified index. For values of ch
in the range from 0 to 0xFFFF (inclusive), the index returned is the largest value k such that:
(this.charAt(k) == ch) &&
(k <= fromIndex)
is true. For other values of ch
, it is the largest value k such that:
(this.codePointAt(k) == ch) &&
(k <= fromIndex)
is true. In either case, if no such character occurs in this string at or before position fromIndex
, then -1
is returned. All indices are specified in char
values (Unicode code units).
Params: - ch – a character (Unicode code point).
- fromIndex – the index to start the search from. There is no restriction on the value of
fromIndex
. If it is greater than or equal to the length of this string, it has the same effect as if it were equal to one less than the length of this string: this entire string may be searched. If it is negative, it has the same effect as if it were -1: -1 is returned.
Returns: the index of the last occurrence of the character in the character sequence represented by this object that is less than or equal to fromIndex
, or -1
if the character does not occur before that point.
/**
* Returns the index within this string of the last occurrence of
* the specified character, searching backward starting at the
* specified index. For values of {@code ch} in the range
* from 0 to 0xFFFF (inclusive), the index returned is the largest
* value <i>k</i> such that:
* <blockquote><pre>
* (this.charAt(<i>k</i>) == ch) {@code &&} (<i>k</i> <= fromIndex)
* </pre></blockquote>
* is true. For other values of {@code ch}, it is the
* largest value <i>k</i> such that:
* <blockquote><pre>
* (this.codePointAt(<i>k</i>) == ch) {@code &&} (<i>k</i> <= fromIndex)
* </pre></blockquote>
* is true. In either case, if no such character occurs in this
* string at or before position {@code fromIndex}, then
* {@code -1} is returned.
*
* <p>All indices are specified in {@code char} values
* (Unicode code units).
*
* @param ch a character (Unicode code point).
* @param fromIndex the index to start the search from. There is no
* restriction on the value of {@code fromIndex}. If it is
* greater than or equal to the length of this string, it has
* the same effect as if it were equal to one less than the
* length of this string: this entire string may be searched.
* If it is negative, it has the same effect as if it were -1:
* -1 is returned.
* @return the index of the last occurrence of the character in the
* character sequence represented by this object that is less
* than or equal to {@code fromIndex}, or {@code -1}
* if the character does not occur before that point.
*/
public int lastIndexOf(int ch, int fromIndex) {
return isLatin1() ? StringLatin1.lastIndexOf(value, ch, fromIndex)
: StringUTF16.lastIndexOf(value, ch, fromIndex);
}
Returns the index within this string of the first occurrence of the
specified substring.
The returned index is the smallest value k
for which:
this.startsWith(str, k)
If no such value of k
exists, then -1
is returned. Params: - str – the substring to search for.
Returns: the index of the first occurrence of the specified substring, or -1
if there is no such occurrence.
/**
* Returns the index within this string of the first occurrence of the
* specified substring.
*
* <p>The returned index is the smallest value {@code k} for which:
* <pre>{@code
* this.startsWith(str, k)
* }</pre>
* If no such value of {@code k} exists, then {@code -1} is returned.
*
* @param str the substring to search for.
* @return the index of the first occurrence of the specified substring,
* or {@code -1} if there is no such occurrence.
*/
public int indexOf(String str) {
byte coder = coder();
if (coder == str.coder()) {
return isLatin1() ? StringLatin1.indexOf(value, str.value)
: StringUTF16.indexOf(value, str.value);
}
if (coder == LATIN1) { // str.coder == UTF16
return -1;
}
return StringUTF16.indexOfLatin1(value, str.value);
}
Returns the index within this string of the first occurrence of the
specified substring, starting at the specified index.
The returned index is the smallest value k
for which:
k >= Math.min(fromIndex, this.length()) &&
this.startsWith(str, k)
If no such value of k
exists, then -1
is returned. Params: - str – the substring to search for.
- fromIndex – the index from which to start the search.
Returns: the index of the first occurrence of the specified substring, starting at the specified index, or -1
if there is no such occurrence.
/**
* Returns the index within this string of the first occurrence of the
* specified substring, starting at the specified index.
*
* <p>The returned index is the smallest value {@code k} for which:
* <pre>{@code
* k >= Math.min(fromIndex, this.length()) &&
* this.startsWith(str, k)
* }</pre>
* If no such value of {@code k} exists, then {@code -1} is returned.
*
* @param str the substring to search for.
* @param fromIndex the index from which to start the search.
* @return the index of the first occurrence of the specified substring,
* starting at the specified index,
* or {@code -1} if there is no such occurrence.
*/
public int indexOf(String str, int fromIndex) {
return indexOf(value, coder(), length(), str, fromIndex);
}
Code shared by String and AbstractStringBuilder to do searches. The
source is the character array being searched, and the target
is the string being searched for.
Params: - src – the characters being searched.
- srcCoder – the coder of the source string.
- srcCount – length of the source string.
- tgtStr – the characters being searched for.
- fromIndex – the index to begin searching from.
/**
* Code shared by String and AbstractStringBuilder to do searches. The
* source is the character array being searched, and the target
* is the string being searched for.
*
* @param src the characters being searched.
* @param srcCoder the coder of the source string.
* @param srcCount length of the source string.
* @param tgtStr the characters being searched for.
* @param fromIndex the index to begin searching from.
*/
static int indexOf(byte[] src, byte srcCoder, int srcCount,
String tgtStr, int fromIndex) {
byte[] tgt = tgtStr.value;
byte tgtCoder = tgtStr.coder();
int tgtCount = tgtStr.length();
if (fromIndex >= srcCount) {
return (tgtCount == 0 ? srcCount : -1);
}
if (fromIndex < 0) {
fromIndex = 0;
}
if (tgtCount == 0) {
return fromIndex;
}
if (tgtCount > srcCount) {
return -1;
}
if (srcCoder == tgtCoder) {
return srcCoder == LATIN1
? StringLatin1.indexOf(src, srcCount, tgt, tgtCount, fromIndex)
: StringUTF16.indexOf(src, srcCount, tgt, tgtCount, fromIndex);
}
if (srcCoder == LATIN1) { // && tgtCoder == UTF16
return -1;
}
// srcCoder == UTF16 && tgtCoder == LATIN1) {
return StringUTF16.indexOfLatin1(src, srcCount, tgt, tgtCount, fromIndex);
}
Returns the index within this string of the last occurrence of the specified substring. The last occurrence of the empty string "" is considered to occur at the index value this.length()
. The returned index is the largest value k
for which:
this.startsWith(str, k)
If no such value of k
exists, then -1
is returned. Params: - str – the substring to search for.
Returns: the index of the last occurrence of the specified substring, or -1
if there is no such occurrence.
/**
* Returns the index within this string of the last occurrence of the
* specified substring. The last occurrence of the empty string ""
* is considered to occur at the index value {@code this.length()}.
*
* <p>The returned index is the largest value {@code k} for which:
* <pre>{@code
* this.startsWith(str, k)
* }</pre>
* If no such value of {@code k} exists, then {@code -1} is returned.
*
* @param str the substring to search for.
* @return the index of the last occurrence of the specified substring,
* or {@code -1} if there is no such occurrence.
*/
public int lastIndexOf(String str) {
return lastIndexOf(str, length());
}
Returns the index within this string of the last occurrence of the
specified substring, searching backward starting at the specified index.
The returned index is the largest value k
for which:
k <= Math.min(fromIndex, this.length()) &&
this.startsWith(str, k)
If no such value of k
exists, then -1
is returned. Params: - str – the substring to search for.
- fromIndex – the index to start the search from.
Returns: the index of the last occurrence of the specified substring, searching backward from the specified index, or -1
if there is no such occurrence.
/**
* Returns the index within this string of the last occurrence of the
* specified substring, searching backward starting at the specified index.
*
* <p>The returned index is the largest value {@code k} for which:
* <pre>{@code
* k <= Math.min(fromIndex, this.length()) &&
* this.startsWith(str, k)
* }</pre>
* If no such value of {@code k} exists, then {@code -1} is returned.
*
* @param str the substring to search for.
* @param fromIndex the index to start the search from.
* @return the index of the last occurrence of the specified substring,
* searching backward from the specified index,
* or {@code -1} if there is no such occurrence.
*/
public int lastIndexOf(String str, int fromIndex) {
return lastIndexOf(value, coder(), length(), str, fromIndex);
}
Code shared by String and AbstractStringBuilder to do searches. The
source is the character array being searched, and the target
is the string being searched for.
Params: - src – the characters being searched.
- srcCoder – coder handles the mapping between bytes/chars
- srcCount – count of the source string.
- tgt – the characters being searched for.
- fromIndex – the index to begin searching from.
/**
* Code shared by String and AbstractStringBuilder to do searches. The
* source is the character array being searched, and the target
* is the string being searched for.
*
* @param src the characters being searched.
* @param srcCoder coder handles the mapping between bytes/chars
* @param srcCount count of the source string.
* @param tgt the characters being searched for.
* @param fromIndex the index to begin searching from.
*/
static int lastIndexOf(byte[] src, byte srcCoder, int srcCount,
String tgtStr, int fromIndex) {
byte[] tgt = tgtStr.value;
byte tgtCoder = tgtStr.coder();
int tgtCount = tgtStr.length();
/*
* Check arguments; return immediately where possible. For
* consistency, don't check for null str.
*/
int rightIndex = srcCount - tgtCount;
if (fromIndex > rightIndex) {
fromIndex = rightIndex;
}
if (fromIndex < 0) {
return -1;
}
/* Empty string always matches. */
if (tgtCount == 0) {
return fromIndex;
}
if (srcCoder == tgtCoder) {
return srcCoder == LATIN1
? StringLatin1.lastIndexOf(src, srcCount, tgt, tgtCount, fromIndex)
: StringUTF16.lastIndexOf(src, srcCount, tgt, tgtCount, fromIndex);
}
if (srcCoder == LATIN1) { // && tgtCoder == UTF16
return -1;
}
// srcCoder == UTF16 && tgtCoder == LATIN1
return StringUTF16.lastIndexOfLatin1(src, srcCount, tgt, tgtCount, fromIndex);
}
Returns a string that is a substring of this string. The
substring begins with the character at the specified index and
extends to the end of this string.
Examples:
"unhappy".substring(2) returns "happy"
"Harbison".substring(3) returns "bison"
"emptiness".substring(9) returns "" (an empty string)
Params: - beginIndex – the beginning index, inclusive.
Throws: - IndexOutOfBoundsException – if
beginIndex
is negative or larger than the length of this String
object.
Returns: the specified substring.
/**
* Returns a string that is a substring of this string. The
* substring begins with the character at the specified index and
* extends to the end of this string. <p>
* Examples:
* <blockquote><pre>
* "unhappy".substring(2) returns "happy"
* "Harbison".substring(3) returns "bison"
* "emptiness".substring(9) returns "" (an empty string)
* </pre></blockquote>
*
* @param beginIndex the beginning index, inclusive.
* @return the specified substring.
* @throws IndexOutOfBoundsException if
* {@code beginIndex} is negative or larger than the
* length of this {@code String} object.
*/
public String substring(int beginIndex) {
return substring(beginIndex, length());
}
Returns a string that is a substring of this string. The substring begins at the specified beginIndex
and extends to the character at index endIndex - 1
. Thus the length of the substring is endIndex-beginIndex
.
Examples:
"hamburger".substring(4, 8) returns "urge"
"smiles".substring(1, 5) returns "mile"
Params: - beginIndex – the beginning index, inclusive.
- endIndex – the ending index, exclusive.
Throws: - IndexOutOfBoundsException – if the
beginIndex
is negative, or endIndex
is larger than the length of this String
object, or beginIndex
is larger than endIndex
.
Returns: the specified substring.
/**
* Returns a string that is a substring of this string. The
* substring begins at the specified {@code beginIndex} and
* extends to the character at index {@code endIndex - 1}.
* Thus the length of the substring is {@code endIndex-beginIndex}.
* <p>
* Examples:
* <blockquote><pre>
* "hamburger".substring(4, 8) returns "urge"
* "smiles".substring(1, 5) returns "mile"
* </pre></blockquote>
*
* @param beginIndex the beginning index, inclusive.
* @param endIndex the ending index, exclusive.
* @return the specified substring.
* @throws IndexOutOfBoundsException if the
* {@code beginIndex} is negative, or
* {@code endIndex} is larger than the length of
* this {@code String} object, or
* {@code beginIndex} is larger than
* {@code endIndex}.
*/
public String substring(int beginIndex, int endIndex) {
int length = length();
checkBoundsBeginEnd(beginIndex, endIndex, length);
int subLen = endIndex - beginIndex;
if (beginIndex == 0 && endIndex == length) {
return this;
}
return isLatin1() ? StringLatin1.newString(value, beginIndex, subLen)
: StringUTF16.newString(value, beginIndex, subLen);
}
Returns a character sequence that is a subsequence of this sequence.
An invocation of this method of the form
str.subSequence(begin, end)
behaves in exactly the same way as the invocation
str.substring(begin, end)
Params: - beginIndex – the begin index, inclusive.
- endIndex – the end index, exclusive.
Throws: - IndexOutOfBoundsException – if
beginIndex
or endIndex
is negative, if endIndex
is greater than length()
, or if beginIndex
is greater than endIndex
API Note: This method is defined so that the String
class can implement the CharSequence
interface. Returns: the specified subsequence. Since: 1.4 @spec JSR-51
/**
* Returns a character sequence that is a subsequence of this sequence.
*
* <p> An invocation of this method of the form
*
* <blockquote><pre>
* str.subSequence(begin, end)</pre></blockquote>
*
* behaves in exactly the same way as the invocation
*
* <blockquote><pre>
* str.substring(begin, end)</pre></blockquote>
*
* @apiNote
* This method is defined so that the {@code String} class can implement
* the {@link CharSequence} interface.
*
* @param beginIndex the begin index, inclusive.
* @param endIndex the end index, exclusive.
* @return the specified subsequence.
*
* @throws IndexOutOfBoundsException
* if {@code beginIndex} or {@code endIndex} is negative,
* if {@code endIndex} is greater than {@code length()},
* or if {@code beginIndex} is greater than {@code endIndex}
*
* @since 1.4
* @spec JSR-51
*/
public CharSequence subSequence(int beginIndex, int endIndex) {
return this.substring(beginIndex, endIndex);
}
Concatenates the specified string to the end of this string.
If the length of the argument string is 0
, then this String
object is returned. Otherwise, a String
object is returned that represents a character sequence that is the concatenation of the character sequence represented by this String
object and the character sequence represented by the argument string.
Examples:
"cares".concat("s") returns "caress"
"to".concat("get").concat("her") returns "together"
Params: - str – the
String
that is concatenated to the end of this String
.
Returns: a string that represents the concatenation of this object's
characters followed by the string argument's characters.
/**
* Concatenates the specified string to the end of this string.
* <p>
* If the length of the argument string is {@code 0}, then this
* {@code String} object is returned. Otherwise, a
* {@code String} object is returned that represents a character
* sequence that is the concatenation of the character sequence
* represented by this {@code String} object and the character
* sequence represented by the argument string.<p>
* Examples:
* <blockquote><pre>
* "cares".concat("s") returns "caress"
* "to".concat("get").concat("her") returns "together"
* </pre></blockquote>
*
* @param str the {@code String} that is concatenated to the end
* of this {@code String}.
* @return a string that represents the concatenation of this object's
* characters followed by the string argument's characters.
*/
public String concat(String str) {
if (str.isEmpty()) {
return this;
}
return StringConcatHelper.simpleConcat(this, str);
}
Returns a string resulting from replacing all occurrences of oldChar
in this string with newChar
. If the character oldChar
does not occur in the character sequence represented by this String
object, then a reference to this String
object is returned. Otherwise, a String
object is returned that represents a character sequence identical to the character sequence represented by this String
object, except that every occurrence of oldChar
is replaced by an occurrence of newChar
.
Examples:
"mesquite in your cellar".replace('e', 'o')
returns "mosquito in your collar"
"the war of baronets".replace('r', 'y')
returns "the way of bayonets"
"sparring with a purple porpoise".replace('p', 't')
returns "starring with a turtle tortoise"
"JonL".replace('q', 'x') returns "JonL" (no change)
Params: - oldChar – the old character.
- newChar – the new character.
Returns: a string derived from this string by replacing every occurrence of oldChar
with newChar
.
/**
* Returns a string resulting from replacing all occurrences of
* {@code oldChar} in this string with {@code newChar}.
* <p>
* If the character {@code oldChar} does not occur in the
* character sequence represented by this {@code String} object,
* then a reference to this {@code String} object is returned.
* Otherwise, a {@code String} object is returned that
* represents a character sequence identical to the character sequence
* represented by this {@code String} object, except that every
* occurrence of {@code oldChar} is replaced by an occurrence
* of {@code newChar}.
* <p>
* Examples:
* <blockquote><pre>
* "mesquite in your cellar".replace('e', 'o')
* returns "mosquito in your collar"
* "the war of baronets".replace('r', 'y')
* returns "the way of bayonets"
* "sparring with a purple porpoise".replace('p', 't')
* returns "starring with a turtle tortoise"
* "JonL".replace('q', 'x') returns "JonL" (no change)
* </pre></blockquote>
*
* @param oldChar the old character.
* @param newChar the new character.
* @return a string derived from this string by replacing every
* occurrence of {@code oldChar} with {@code newChar}.
*/
public String replace(char oldChar, char newChar) {
if (oldChar != newChar) {
String ret = isLatin1() ? StringLatin1.replace(value, oldChar, newChar)
: StringUTF16.replace(value, oldChar, newChar);
if (ret != null) {
return ret;
}
}
return this;
}
Tells whether or not this string matches the given regular expression.
An invocation of this method of the form
str.matches(
regex)
yields exactly the same result as the expression
Pattern
.
matches(regex, str)
Params: - regex –
the regular expression to which this string is to be matched
Throws: - PatternSyntaxException –
if the regular expression's syntax is invalid
See Also: Returns: true
if, and only if, this string matches the given regular expression Since: 1.4 @spec JSR-51
/**
* Tells whether or not this string matches the given <a
* href="../util/regex/Pattern.html#sum">regular expression</a>.
*
* <p> An invocation of this method of the form
* <i>str</i>{@code .matches(}<i>regex</i>{@code )} yields exactly the
* same result as the expression
*
* <blockquote>
* {@link java.util.regex.Pattern}.{@link java.util.regex.Pattern#matches(String,CharSequence)
* matches(<i>regex</i>, <i>str</i>)}
* </blockquote>
*
* @param regex
* the regular expression to which this string is to be matched
*
* @return {@code true} if, and only if, this string matches the
* given regular expression
*
* @throws PatternSyntaxException
* if the regular expression's syntax is invalid
*
* @see java.util.regex.Pattern
*
* @since 1.4
* @spec JSR-51
*/
public boolean matches(String regex) {
return Pattern.matches(regex, this);
}
Returns true if and only if this string contains the specified
sequence of char values.
Params: - s – the sequence to search for
Returns: true if this string contains s
, false otherwise Since: 1.5
/**
* Returns true if and only if this string contains the specified
* sequence of char values.
*
* @param s the sequence to search for
* @return true if this string contains {@code s}, false otherwise
* @since 1.5
*/
public boolean contains(CharSequence s) {
return indexOf(s.toString()) >= 0;
}
Replaces the first substring of this string that matches the given regular expression with the
given replacement.
An invocation of this method of the form
str.replaceFirst(
regex,
repl)
yields exactly the same result as the expression
Pattern
.compile
(regex).matcher
(str).replaceFirst
(repl)
Note that backslashes (\
) and dollar signs ($
) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceFirst
. Use Matcher.quoteReplacement
to suppress the special meaning of these characters, if desired.
Params: - regex –
the regular expression to which this string is to be matched
- replacement –
the string to be substituted for the first match
Throws: - PatternSyntaxException –
if the regular expression's syntax is invalid
See Also: Returns: The resulting String
Since: 1.4 @spec JSR-51
/**
* Replaces the first substring of this string that matches the given <a
* href="../util/regex/Pattern.html#sum">regular expression</a> with the
* given replacement.
*
* <p> An invocation of this method of the form
* <i>str</i>{@code .replaceFirst(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
* yields exactly the same result as the expression
*
* <blockquote>
* <code>
* {@link java.util.regex.Pattern}.{@link
* java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
* java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
* java.util.regex.Matcher#replaceFirst replaceFirst}(<i>repl</i>)
* </code>
* </blockquote>
*
*<p>
* Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
* replacement string may cause the results to be different than if it were
* being treated as a literal replacement string; see
* {@link java.util.regex.Matcher#replaceFirst}.
* Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
* meaning of these characters, if desired.
*
* @param regex
* the regular expression to which this string is to be matched
* @param replacement
* the string to be substituted for the first match
*
* @return The resulting {@code String}
*
* @throws PatternSyntaxException
* if the regular expression's syntax is invalid
*
* @see java.util.regex.Pattern
*
* @since 1.4
* @spec JSR-51
*/
public String replaceFirst(String regex, String replacement) {
return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
}
Replaces each substring of this string that matches the given regular expression with the
given replacement.
An invocation of this method of the form
str.replaceAll(
regex,
repl)
yields exactly the same result as the expression
Pattern
.compile
(regex).matcher
(str).replaceAll
(repl)
Note that backslashes (\
) and dollar signs ($
) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll
. Use Matcher.quoteReplacement
to suppress the special meaning of these characters, if desired.
Params: - regex –
the regular expression to which this string is to be matched
- replacement –
the string to be substituted for each match
Throws: - PatternSyntaxException –
if the regular expression's syntax is invalid
See Also: Returns: The resulting String
Since: 1.4 @spec JSR-51
/**
* Replaces each substring of this string that matches the given <a
* href="../util/regex/Pattern.html#sum">regular expression</a> with the
* given replacement.
*
* <p> An invocation of this method of the form
* <i>str</i>{@code .replaceAll(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
* yields exactly the same result as the expression
*
* <blockquote>
* <code>
* {@link java.util.regex.Pattern}.{@link
* java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
* java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
* java.util.regex.Matcher#replaceAll replaceAll}(<i>repl</i>)
* </code>
* </blockquote>
*
*<p>
* Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
* replacement string may cause the results to be different than if it were
* being treated as a literal replacement string; see
* {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
* Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
* meaning of these characters, if desired.
*
* @param regex
* the regular expression to which this string is to be matched
* @param replacement
* the string to be substituted for each match
*
* @return The resulting {@code String}
*
* @throws PatternSyntaxException
* if the regular expression's syntax is invalid
*
* @see java.util.regex.Pattern
*
* @since 1.4
* @spec JSR-51
*/
public String replaceAll(String regex, String replacement) {
return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
Replaces each substring of this string that matches the literal target
sequence with the specified literal replacement sequence. The
replacement proceeds from the beginning of the string to the end, for
example, replacing "aa" with "b" in the string "aaa" will result in
"ba" rather than "ab".
Params: - target – The sequence of char values to be replaced
- replacement – The replacement sequence of char values
Returns: The resulting string Since: 1.5
/**
* Replaces each substring of this string that matches the literal target
* sequence with the specified literal replacement sequence. The
* replacement proceeds from the beginning of the string to the end, for
* example, replacing "aa" with "b" in the string "aaa" will result in
* "ba" rather than "ab".
*
* @param target The sequence of char values to be replaced
* @param replacement The replacement sequence of char values
* @return The resulting string
* @since 1.5
*/
public String replace(CharSequence target, CharSequence replacement) {
String trgtStr = target.toString();
String replStr = replacement.toString();
int thisLen = length();
int trgtLen = trgtStr.length();
int replLen = replStr.length();
if (trgtLen > 0) {
if (trgtLen == 1 && replLen == 1) {
return replace(trgtStr.charAt(0), replStr.charAt(0));
}
boolean thisIsLatin1 = this.isLatin1();
boolean trgtIsLatin1 = trgtStr.isLatin1();
boolean replIsLatin1 = replStr.isLatin1();
String ret = (thisIsLatin1 && trgtIsLatin1 && replIsLatin1)
? StringLatin1.replace(value, thisLen,
trgtStr.value, trgtLen,
replStr.value, replLen)
: StringUTF16.replace(value, thisLen, thisIsLatin1,
trgtStr.value, trgtLen, trgtIsLatin1,
replStr.value, replLen, replIsLatin1);
if (ret != null) {
return ret;
}
return this;
} else { // trgtLen == 0
int resultLen;
try {
resultLen = Math.addExact(thisLen, Math.multiplyExact(
Math.addExact(thisLen, 1), replLen));
} catch (ArithmeticException ignored) {
throw new OutOfMemoryError();
}
StringBuilder sb = new StringBuilder(resultLen);
sb.append(replStr);
for (int i = 0; i < thisLen; ++i) {
sb.append(charAt(i)).append(replStr);
}
return sb.toString();
}
}
Splits this string around matches of the given
regular expression.
The array returned by this method contains each substring of this
string that is terminated by another substring that matches the given
expression or is terminated by the end of the string. The substrings in
the array are in the order in which they occur in this string. If the
expression does not match any part of the input then the resulting array
has just one element, namely this string.
When there is a positive-width match at the beginning of this
string then an empty leading substring is included at the beginning
of the resulting array. A zero-width match at the beginning however
never produces such empty leading substring.
The limit
parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array.
If the limit is positive then the pattern will be applied
at most limit - 1 times, the array's length will be
no greater than limit, and the array's last entry will contain
all input beyond the last matched delimiter.
If the limit is zero then the pattern will be applied as
many times as possible, the array can have any length, and trailing
empty strings will be discarded.
If the limit is negative then the pattern will be applied
as many times as possible and the array can have any length.
The string "boo:and:foo"
, for example, yields the following results with these parameters:
Split example showing regex, limit, and result
Regex
Limit
Result
:
2
{ "boo", "and:foo" }
5
{ "boo", "and", "foo" }
-2
{ "boo", "and", "foo" }
o
5
{ "b", "", ":and:f", "", "" }
-2
{ "b", "", ":and:f", "", "" }
0
{ "b", "", ":and:f" }
An invocation of this method of the form
str.split(
regex,
n)
yields the same result as the expression
Pattern
.compile
(regex).split
(str, n)
Params: - regex –
the delimiting regular expression
- limit –
the result threshold, as described above
Throws: - PatternSyntaxException –
if the regular expression's syntax is invalid
See Also: Returns: the array of strings computed by splitting this string
around matches of the given regular expression Since: 1.4 @spec JSR-51
/**
* Splits this string around matches of the given
* <a href="../util/regex/Pattern.html#sum">regular expression</a>.
*
* <p> The array returned by this method contains each substring of this
* string that is terminated by another substring that matches the given
* expression or is terminated by the end of the string. The substrings in
* the array are in the order in which they occur in this string. If the
* expression does not match any part of the input then the resulting array
* has just one element, namely this string.
*
* <p> When there is a positive-width match at the beginning of this
* string then an empty leading substring is included at the beginning
* of the resulting array. A zero-width match at the beginning however
* never produces such empty leading substring.
*
* <p> The {@code limit} parameter controls the number of times the
* pattern is applied and therefore affects the length of the resulting
* array.
* <ul>
* <li><p>
* If the <i>limit</i> is positive then the pattern will be applied
* at most <i>limit</i> - 1 times, the array's length will be
* no greater than <i>limit</i>, and the array's last entry will contain
* all input beyond the last matched delimiter.</p></li>
*
* <li><p>
* If the <i>limit</i> is zero then the pattern will be applied as
* many times as possible, the array can have any length, and trailing
* empty strings will be discarded.</p></li>
*
* <li><p>
* If the <i>limit</i> is negative then the pattern will be applied
* as many times as possible and the array can have any length.</p></li>
* </ul>
*
* <p> The string {@code "boo:and:foo"}, for example, yields the
* following results with these parameters:
*
* <blockquote><table class="plain">
* <caption style="display:none">Split example showing regex, limit, and result</caption>
* <thead>
* <tr>
* <th scope="col">Regex</th>
* <th scope="col">Limit</th>
* <th scope="col">Result</th>
* </tr>
* </thead>
* <tbody>
* <tr><th scope="row" rowspan="3" style="font-weight:normal">:</th>
* <th scope="row" style="font-weight:normal; text-align:right; padding-right:1em">2</th>
* <td>{@code { "boo", "and:foo" }}</td></tr>
* <tr><!-- : -->
* <th scope="row" style="font-weight:normal; text-align:right; padding-right:1em">5</th>
* <td>{@code { "boo", "and", "foo" }}</td></tr>
* <tr><!-- : -->
* <th scope="row" style="font-weight:normal; text-align:right; padding-right:1em">-2</th>
* <td>{@code { "boo", "and", "foo" }}</td></tr>
* <tr><th scope="row" rowspan="3" style="font-weight:normal">o</th>
* <th scope="row" style="font-weight:normal; text-align:right; padding-right:1em">5</th>
* <td>{@code { "b", "", ":and:f", "", "" }}</td></tr>
* <tr><!-- o -->
* <th scope="row" style="font-weight:normal; text-align:right; padding-right:1em">-2</th>
* <td>{@code { "b", "", ":and:f", "", "" }}</td></tr>
* <tr><!-- o -->
* <th scope="row" style="font-weight:normal; text-align:right; padding-right:1em">0</th>
* <td>{@code { "b", "", ":and:f" }}</td></tr>
* </tbody>
* </table></blockquote>
*
* <p> An invocation of this method of the form
* <i>str.</i>{@code split(}<i>regex</i>{@code ,} <i>n</i>{@code )}
* yields the same result as the expression
*
* <blockquote>
* <code>
* {@link java.util.regex.Pattern}.{@link
* java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
* java.util.regex.Pattern#split(java.lang.CharSequence,int) split}(<i>str</i>, <i>n</i>)
* </code>
* </blockquote>
*
*
* @param regex
* the delimiting regular expression
*
* @param limit
* the result threshold, as described above
*
* @return the array of strings computed by splitting this string
* around matches of the given regular expression
*
* @throws PatternSyntaxException
* if the regular expression's syntax is invalid
*
* @see java.util.regex.Pattern
*
* @since 1.4
* @spec JSR-51
*/
public String[] split(String regex, int limit) {
/* fastpath if the regex is a
(1)one-char String and this character is not one of the
RegEx's meta characters ".$|()[{^?*+\\", or
(2)two-char String and the first char is the backslash and
the second is not the ascii digit or ascii letter.
*/
char ch = 0;
if (((regex.length() == 1 &&
".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
(regex.length() == 2 &&
regex.charAt(0) == '\\' &&
(((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
((ch-'a')|('z'-ch)) < 0 &&
((ch-'A')|('Z'-ch)) < 0)) &&
(ch < Character.MIN_HIGH_SURROGATE ||
ch > Character.MAX_LOW_SURROGATE))
{
int off = 0;
int next = 0;
boolean limited = limit > 0;
ArrayList<String> list = new ArrayList<>();
while ((next = indexOf(ch, off)) != -1) {
if (!limited || list.size() < limit - 1) {
list.add(substring(off, next));
off = next + 1;
} else { // last one
//assert (list.size() == limit - 1);
int last = length();
list.add(substring(off, last));
off = last;
break;
}
}
// If no match was found, return this
if (off == 0)
return new String[]{this};
// Add remaining segment
if (!limited || list.size() < limit)
list.add(substring(off, length()));
// Construct result
int resultSize = list.size();
if (limit == 0) {
while (resultSize > 0 && list.get(resultSize - 1).isEmpty()) {
resultSize--;
}
}
String[] result = new String[resultSize];
return list.subList(0, resultSize).toArray(result);
}
return Pattern.compile(regex).split(this, limit);
}
Splits this string around matches of the given regular expression.
This method works as if by invoking the two-argument split
method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
The string "boo:and:foo"
, for example, yields the following results with these expressions:
Split examples showing regex and result
Regex
Result
:
{ "boo", "and", "foo" }
o
{ "b", "", ":and:f" }
Params: - regex –
the delimiting regular expression
Throws: - PatternSyntaxException –
if the regular expression's syntax is invalid
See Also: Returns: the array of strings computed by splitting this string
around matches of the given regular expression Since: 1.4 @spec JSR-51
/**
* Splits this string around matches of the given <a
* href="../util/regex/Pattern.html#sum">regular expression</a>.
*
* <p> This method works as if by invoking the two-argument {@link
* #split(String, int) split} method with the given expression and a limit
* argument of zero. Trailing empty strings are therefore not included in
* the resulting array.
*
* <p> The string {@code "boo:and:foo"}, for example, yields the following
* results with these expressions:
*
* <blockquote><table class="plain">
* <caption style="display:none">Split examples showing regex and result</caption>
* <thead>
* <tr>
* <th scope="col">Regex</th>
* <th scope="col">Result</th>
* </tr>
* </thead>
* <tbody>
* <tr><th scope="row" style="text-weight:normal">:</th>
* <td>{@code { "boo", "and", "foo" }}</td></tr>
* <tr><th scope="row" style="text-weight:normal">o</th>
* <td>{@code { "b", "", ":and:f" }}</td></tr>
* </tbody>
* </table></blockquote>
*
*
* @param regex
* the delimiting regular expression
*
* @return the array of strings computed by splitting this string
* around matches of the given regular expression
*
* @throws PatternSyntaxException
* if the regular expression's syntax is invalid
*
* @see java.util.regex.Pattern
*
* @since 1.4
* @spec JSR-51
*/
public String[] split(String regex) {
return split(regex, 0);
}
Returns a new String composed of copies of the CharSequence elements
joined together with a copy of the specified delimiter
. For example,
String message = String.join("-", "Java", "is", "cool");
// message returned is: "Java-is-cool"
Note that if an element is null, then "null"
is added. Params: - delimiter – the delimiter that separates each element
- elements – the elements to join together.
Throws: - NullPointerException – If
delimiter
or elements
is null
See Also: Returns: a new String
that is composed of the elements
separated by the delimiter
Since: 1.8
/**
* Returns a new String composed of copies of the
* {@code CharSequence elements} joined together with a copy of
* the specified {@code delimiter}.
*
* <blockquote>For example,
* <pre>{@code
* String message = String.join("-", "Java", "is", "cool");
* // message returned is: "Java-is-cool"
* }</pre></blockquote>
*
* Note that if an element is null, then {@code "null"} is added.
*
* @param delimiter the delimiter that separates each element
* @param elements the elements to join together.
*
* @return a new {@code String} that is composed of the {@code elements}
* separated by the {@code delimiter}
*
* @throws NullPointerException If {@code delimiter} or {@code elements}
* is {@code null}
*
* @see java.util.StringJoiner
* @since 1.8
*/
public static String join(CharSequence delimiter, CharSequence... elements) {
Objects.requireNonNull(delimiter);
Objects.requireNonNull(elements);
// Number of elements not likely worth Arrays.stream overhead.
StringJoiner joiner = new StringJoiner(delimiter);
for (CharSequence cs: elements) {
joiner.add(cs);
}
return joiner.toString();
}
Returns a new String
composed of copies of the CharSequence elements
joined together with a copy of the specified delimiter
. For example,
List<String> strings = List.of("Java", "is", "cool");
String message = String.join(" ", strings);
//message returned is: "Java is cool"
Set<String> strings =
new LinkedHashSet<>(List.of("Java", "is", "very", "cool"));
String message = String.join("-", strings);
//message returned is: "Java-is-very-cool"
Note that if an individual element is null
, then "null"
is added. Params: - delimiter – a sequence of characters that is used to separate each of the
elements
in the resulting String
- elements – an
Iterable
that will have its elements
joined together.
Throws: - NullPointerException – If
delimiter
or elements
is null
See Also: Returns: a new String
that is composed from the elements
argument Since: 1.8
/**
* Returns a new {@code String} composed of copies of the
* {@code CharSequence elements} joined together with a copy of the
* specified {@code delimiter}.
*
* <blockquote>For example,
* <pre>{@code
* List<String> strings = List.of("Java", "is", "cool");
* String message = String.join(" ", strings);
* //message returned is: "Java is cool"
*
* Set<String> strings =
* new LinkedHashSet<>(List.of("Java", "is", "very", "cool"));
* String message = String.join("-", strings);
* //message returned is: "Java-is-very-cool"
* }</pre></blockquote>
*
* Note that if an individual element is {@code null}, then {@code "null"} is added.
*
* @param delimiter a sequence of characters that is used to separate each
* of the {@code elements} in the resulting {@code String}
* @param elements an {@code Iterable} that will have its {@code elements}
* joined together.
*
* @return a new {@code String} that is composed from the {@code elements}
* argument
*
* @throws NullPointerException If {@code delimiter} or {@code elements}
* is {@code null}
*
* @see #join(CharSequence,CharSequence...)
* @see java.util.StringJoiner
* @since 1.8
*/
public static String join(CharSequence delimiter,
Iterable<? extends CharSequence> elements) {
Objects.requireNonNull(delimiter);
Objects.requireNonNull(elements);
StringJoiner joiner = new StringJoiner(delimiter);
for (CharSequence cs: elements) {
joiner.add(cs);
}
return joiner.toString();
}
Converts all of the characters in this String
to lower case using the rules of the given Locale
. Case mapping is based on the Unicode Standard version specified by the Character
class. Since case mappings are not always 1:1 char mappings, the resulting String
may be a different length than the original String
.
Examples of lowercase mappings are in the following table:
Lowercase mapping examples showing language code of locale, upper case, lower case, and description
Language Code of Locale
Upper Case
Lower Case
Description
tr (Turkish)
\u0130
\u0069
capital letter I with dot above -> small letter i
tr (Turkish)
\u0049
\u0131
capital letter I -> small letter dotless i
(all)
French Fries
french fries
lowercased all chars in String
(all)
ΙΧΘΥΣ
ιχθυσ
lowercased all chars in String
Params: - locale – use the case transformation rules for this locale
See Also: Returns: the String
, converted to lowercase. Since: 1.1
/**
* Converts all of the characters in this {@code String} to lower
* case using the rules of the given {@code Locale}. Case mapping is based
* on the Unicode Standard version specified by the {@link java.lang.Character Character}
* class. Since case mappings are not always 1:1 char mappings, the resulting
* {@code String} may be a different length than the original {@code String}.
* <p>
* Examples of lowercase mappings are in the following table:
* <table class="plain">
* <caption style="display:none">Lowercase mapping examples showing language code of locale, upper case, lower case, and description</caption>
* <thead>
* <tr>
* <th scope="col">Language Code of Locale</th>
* <th scope="col">Upper Case</th>
* <th scope="col">Lower Case</th>
* <th scope="col">Description</th>
* </tr>
* </thead>
* <tbody>
* <tr>
* <td>tr (Turkish)</td>
* <th scope="row" style="font-weight:normal; text-align:left">\u0130</th>
* <td>\u0069</td>
* <td>capital letter I with dot above -> small letter i</td>
* </tr>
* <tr>
* <td>tr (Turkish)</td>
* <th scope="row" style="font-weight:normal; text-align:left">\u0049</th>
* <td>\u0131</td>
* <td>capital letter I -> small letter dotless i </td>
* </tr>
* <tr>
* <td>(all)</td>
* <th scope="row" style="font-weight:normal; text-align:left">French Fries</th>
* <td>french fries</td>
* <td>lowercased all chars in String</td>
* </tr>
* <tr>
* <td>(all)</td>
* <th scope="row" style="font-weight:normal; text-align:left">
* ΙΧΘΥΣ</th>
* <td>ιχθυσ</td>
* <td>lowercased all chars in String</td>
* </tr>
* </tbody>
* </table>
*
* @param locale use the case transformation rules for this locale
* @return the {@code String}, converted to lowercase.
* @see java.lang.String#toLowerCase()
* @see java.lang.String#toUpperCase()
* @see java.lang.String#toUpperCase(Locale)
* @since 1.1
*/
public String toLowerCase(Locale locale) {
return isLatin1() ? StringLatin1.toLowerCase(this, value, locale)
: StringUTF16.toLowerCase(this, value, locale);
}
Converts all of the characters in this String
to lower case using the rules of the default locale. This is equivalent to calling toLowerCase(Locale.getDefault())
.
Note: This method is locale sensitive, and may produce unexpected results if used for strings that are intended to be interpreted locale independently. Examples are programming language identifiers, protocol keys, and HTML tags. For instance, "TITLE".toLowerCase()
in a Turkish locale returns "t\u005Cu0131tle"
, where '\u005Cu0131' is the LATIN SMALL LETTER DOTLESS I character. To obtain correct results for locale insensitive strings, use toLowerCase(Locale.ROOT)
.
See Also: Returns: the String
, converted to lowercase.
/**
* Converts all of the characters in this {@code String} to lower
* case using the rules of the default locale. This is equivalent to calling
* {@code toLowerCase(Locale.getDefault())}.
* <p>
* <b>Note:</b> This method is locale sensitive, and may produce unexpected
* results if used for strings that are intended to be interpreted locale
* independently.
* Examples are programming language identifiers, protocol keys, and HTML
* tags.
* For instance, {@code "TITLE".toLowerCase()} in a Turkish locale
* returns {@code "t\u005Cu0131tle"}, where '\u005Cu0131' is the
* LATIN SMALL LETTER DOTLESS I character.
* To obtain correct results for locale insensitive strings, use
* {@code toLowerCase(Locale.ROOT)}.
*
* @return the {@code String}, converted to lowercase.
* @see java.lang.String#toLowerCase(Locale)
*/
public String toLowerCase() {
return toLowerCase(Locale.getDefault());
}
Converts all of the characters in this String
to upper case using the rules of the given Locale
. Case mapping is based on the Unicode Standard version specified by the Character
class. Since case mappings are not always 1:1 char mappings, the resulting String
may be a different length than the original String
.
Examples of locale-sensitive and 1:M case mappings are in the following table.
Examples of locale-sensitive and 1:M case mappings. Shows Language code of locale, lower case, upper case, and description.
Language Code of Locale
Lower Case
Upper Case
Description
tr (Turkish)
\u0069
\u0130
small letter i -> capital letter I with dot above
tr (Turkish)
\u0131
\u0049
small letter dotless i -> capital letter I
(all)
\u00df
\u0053 \u0053
small letter sharp s -> two letters: SS
(all)
Fahrvergnügen
FAHRVERGNÜGEN
Params: - locale – use the case transformation rules for this locale
See Also: Returns: the String
, converted to uppercase. Since: 1.1
/**
* Converts all of the characters in this {@code String} to upper
* case using the rules of the given {@code Locale}. Case mapping is based
* on the Unicode Standard version specified by the {@link java.lang.Character Character}
* class. Since case mappings are not always 1:1 char mappings, the resulting
* {@code String} may be a different length than the original {@code String}.
* <p>
* Examples of locale-sensitive and 1:M case mappings are in the following table.
*
* <table class="plain">
* <caption style="display:none">Examples of locale-sensitive and 1:M case mappings. Shows Language code of locale, lower case, upper case, and description.</caption>
* <thead>
* <tr>
* <th scope="col">Language Code of Locale</th>
* <th scope="col">Lower Case</th>
* <th scope="col">Upper Case</th>
* <th scope="col">Description</th>
* </tr>
* </thead>
* <tbody>
* <tr>
* <td>tr (Turkish)</td>
* <th scope="row" style="font-weight:normal; text-align:left">\u0069</th>
* <td>\u0130</td>
* <td>small letter i -> capital letter I with dot above</td>
* </tr>
* <tr>
* <td>tr (Turkish)</td>
* <th scope="row" style="font-weight:normal; text-align:left">\u0131</th>
* <td>\u0049</td>
* <td>small letter dotless i -> capital letter I</td>
* </tr>
* <tr>
* <td>(all)</td>
* <th scope="row" style="font-weight:normal; text-align:left">\u00df</th>
* <td>\u0053 \u0053</td>
* <td>small letter sharp s -> two letters: SS</td>
* </tr>
* <tr>
* <td>(all)</td>
* <th scope="row" style="font-weight:normal; text-align:left">Fahrvergnügen</th>
* <td>FAHRVERGNÜGEN</td>
* <td></td>
* </tr>
* </tbody>
* </table>
* @param locale use the case transformation rules for this locale
* @return the {@code String}, converted to uppercase.
* @see java.lang.String#toUpperCase()
* @see java.lang.String#toLowerCase()
* @see java.lang.String#toLowerCase(Locale)
* @since 1.1
*/
public String toUpperCase(Locale locale) {
return isLatin1() ? StringLatin1.toUpperCase(this, value, locale)
: StringUTF16.toUpperCase(this, value, locale);
}
Converts all of the characters in this String
to upper case using the rules of the default locale. This method is equivalent to toUpperCase(Locale.getDefault())
.
Note: This method is locale sensitive, and may produce unexpected results if used for strings that are intended to be interpreted locale independently. Examples are programming language identifiers, protocol keys, and HTML tags. For instance, "title".toUpperCase()
in a Turkish locale returns "T\u005Cu0130TLE"
, where '\u005Cu0130' is the LATIN CAPITAL LETTER I WITH DOT ABOVE character. To obtain correct results for locale insensitive strings, use toUpperCase(Locale.ROOT)
.
See Also: Returns: the String
, converted to uppercase.
/**
* Converts all of the characters in this {@code String} to upper
* case using the rules of the default locale. This method is equivalent to
* {@code toUpperCase(Locale.getDefault())}.
* <p>
* <b>Note:</b> This method is locale sensitive, and may produce unexpected
* results if used for strings that are intended to be interpreted locale
* independently.
* Examples are programming language identifiers, protocol keys, and HTML
* tags.
* For instance, {@code "title".toUpperCase()} in a Turkish locale
* returns {@code "T\u005Cu0130TLE"}, where '\u005Cu0130' is the
* LATIN CAPITAL LETTER I WITH DOT ABOVE character.
* To obtain correct results for locale insensitive strings, use
* {@code toUpperCase(Locale.ROOT)}.
*
* @return the {@code String}, converted to uppercase.
* @see java.lang.String#toUpperCase(Locale)
*/
public String toUpperCase() {
return toUpperCase(Locale.getDefault());
}
Returns a string whose value is this string, with all leading and trailing space removed, where space is defined as any character whose codepoint is less than or equal to 'U+0020'
(the space character). If this String
object represents an empty character sequence, or the first and last characters of character sequence represented by this String
object both have codes that are not space (as defined above), then a reference to this String
object is returned.
Otherwise, if all characters in this string are space (as defined above), then a String
object representing an empty string is returned.
Otherwise, let k be the index of the first character in the
string whose code is not a space (as defined above) and let
m be the index of the last character in the string whose code is not a space (as defined above). A String
object is returned, representing the substring of this string that begins with the character at index k and ends with the
character at index m-that is, the result of this.substring(k, m + 1)
.
This method may be used to trim space (as defined above) from
the beginning and end of a string.
Returns: a string whose value is this string, with all leading
and trailing space removed, or this string if it
has no leading or trailing space.
/**
* Returns a string whose value is this string, with all leading
* and trailing space removed, where space is defined
* as any character whose codepoint is less than or equal to
* {@code 'U+0020'} (the space character).
* <p>
* If this {@code String} object represents an empty character
* sequence, or the first and last characters of character sequence
* represented by this {@code String} object both have codes
* that are not space (as defined above), then a
* reference to this {@code String} object is returned.
* <p>
* Otherwise, if all characters in this string are space (as
* defined above), then a {@code String} object representing an
* empty string is returned.
* <p>
* Otherwise, let <i>k</i> be the index of the first character in the
* string whose code is not a space (as defined above) and let
* <i>m</i> be the index of the last character in the string whose code
* is not a space (as defined above). A {@code String}
* object is returned, representing the substring of this string that
* begins with the character at index <i>k</i> and ends with the
* character at index <i>m</i>-that is, the result of
* {@code this.substring(k, m + 1)}.
* <p>
* This method may be used to trim space (as defined above) from
* the beginning and end of a string.
*
* @return a string whose value is this string, with all leading
* and trailing space removed, or this string if it
* has no leading or trailing space.
*/
public String trim() {
String ret = isLatin1() ? StringLatin1.trim(value)
: StringUTF16.trim(value);
return ret == null ? this : ret;
}
Returns a string whose value is this string, with all leading and trailing white space removed. If this String
object represents an empty string, or if all code points in this string are white space, then an empty string is returned.
Otherwise, returns a substring of this string beginning with the first code point that is not a white space up to and including the last code point that is not a white space.
This method may be used to strip white space from the beginning and end of a string.
See Also: Returns: a string whose value is this string, with all leading
and trailing white space removed Since: 11
/**
* Returns a string whose value is this string, with all leading
* and trailing {@linkplain Character#isWhitespace(int) white space}
* removed.
* <p>
* If this {@code String} object represents an empty string,
* or if all code points in this string are
* {@linkplain Character#isWhitespace(int) white space}, then an empty string
* is returned.
* <p>
* Otherwise, returns a substring of this string beginning with the first
* code point that is not a {@linkplain Character#isWhitespace(int) white space}
* up to and including the last code point that is not a
* {@linkplain Character#isWhitespace(int) white space}.
* <p>
* This method may be used to strip
* {@linkplain Character#isWhitespace(int) white space} from
* the beginning and end of a string.
*
* @return a string whose value is this string, with all leading
* and trailing white space removed
*
* @see Character#isWhitespace(int)
*
* @since 11
*/
public String strip() {
String ret = isLatin1() ? StringLatin1.strip(value)
: StringUTF16.strip(value);
return ret == null ? this : ret;
}
Returns a string whose value is this string, with all leading white space removed. If this String
object represents an empty string, or if all code points in this string are white space, then an empty string is returned.
Otherwise, returns a substring of this string beginning with the first code point that is not a white space up to and including the last code point of this string.
This method may be used to trim white space from the beginning of a string.
See Also: Returns: a string whose value is this string, with all leading white
space removed Since: 11
/**
* Returns a string whose value is this string, with all leading
* {@linkplain Character#isWhitespace(int) white space} removed.
* <p>
* If this {@code String} object represents an empty string,
* or if all code points in this string are
* {@linkplain Character#isWhitespace(int) white space}, then an empty string
* is returned.
* <p>
* Otherwise, returns a substring of this string beginning with the first
* code point that is not a {@linkplain Character#isWhitespace(int) white space}
* up to and including the last code point of this string.
* <p>
* This method may be used to trim
* {@linkplain Character#isWhitespace(int) white space} from
* the beginning of a string.
*
* @return a string whose value is this string, with all leading white
* space removed
*
* @see Character#isWhitespace(int)
*
* @since 11
*/
public String stripLeading() {
String ret = isLatin1() ? StringLatin1.stripLeading(value)
: StringUTF16.stripLeading(value);
return ret == null ? this : ret;
}
Returns a string whose value is this string, with all trailing white space removed. If this String
object represents an empty string, or if all characters in this string are white space, then an empty string is returned.
Otherwise, returns a substring of this string beginning with the first code point of this string up to and including the last code point that is not a white space.
This method may be used to trim white space from the end of a string.
See Also: Returns: a string whose value is this string, with all trailing white
space removed Since: 11
/**
* Returns a string whose value is this string, with all trailing
* {@linkplain Character#isWhitespace(int) white space} removed.
* <p>
* If this {@code String} object represents an empty string,
* or if all characters in this string are
* {@linkplain Character#isWhitespace(int) white space}, then an empty string
* is returned.
* <p>
* Otherwise, returns a substring of this string beginning with the first
* code point of this string up to and including the last code point
* that is not a {@linkplain Character#isWhitespace(int) white space}.
* <p>
* This method may be used to trim
* {@linkplain Character#isWhitespace(int) white space} from
* the end of a string.
*
* @return a string whose value is this string, with all trailing white
* space removed
*
* @see Character#isWhitespace(int)
*
* @since 11
*/
public String stripTrailing() {
String ret = isLatin1() ? StringLatin1.stripTrailing(value)
: StringUTF16.stripTrailing(value);
return ret == null ? this : ret;
}
See Also: Returns: true
if the string is empty or contains only white space codepoints, otherwise false
Since: 11
/**
* Returns {@code true} if the string is empty or contains only
* {@linkplain Character#isWhitespace(int) white space} codepoints,
* otherwise {@code false}.
*
* @return {@code true} if the string is empty or contains only
* {@linkplain Character#isWhitespace(int) white space} codepoints,
* otherwise {@code false}
*
* @see Character#isWhitespace(int)
*
* @since 11
*/
public boolean isBlank() {
return indexOfNonWhitespace() == length();
}
Returns a stream of lines extracted from this string,
separated by line terminators.
A line terminator is one of the following: a line feed character "\n"
(U+000A), a carriage return character "\r"
(U+000D), or a carriage return followed immediately by a line feed "\r\n"
(U+000D U+000A).
A line is either a sequence of zero or more characters
followed by a line terminator, or it is a sequence of one or
more characters followed by the end of the string. A
line does not include the line terminator.
The stream returned by this method contains the lines from
this string in the order in which they occur.
API Note: This definition of line implies that an empty
string has zero lines and that there is no empty line
following a line terminator at the end of a string. Implementation Note: This method provides better performance than
split("\R") by supplying elements lazily and
by faster search of new line terminators. Returns: the stream of lines extracted from this string Since: 11
/**
* Returns a stream of lines extracted from this string,
* separated by line terminators.
* <p>
* A <i>line terminator</i> is one of the following:
* a line feed character {@code "\n"} (U+000A),
* a carriage return character {@code "\r"} (U+000D),
* or a carriage return followed immediately by a line feed
* {@code "\r\n"} (U+000D U+000A).
* <p>
* A <i>line</i> is either a sequence of zero or more characters
* followed by a line terminator, or it is a sequence of one or
* more characters followed by the end of the string. A
* line does not include the line terminator.
* <p>
* The stream returned by this method contains the lines from
* this string in the order in which they occur.
*
* @apiNote This definition of <i>line</i> implies that an empty
* string has zero lines and that there is no empty line
* following a line terminator at the end of a string.
*
* @implNote This method provides better performance than
* split("\R") by supplying elements lazily and
* by faster search of new line terminators.
*
* @return the stream of lines extracted from this string
*
* @since 11
*/
public Stream<String> lines() {
return isLatin1() ? StringLatin1.lines(value) : StringUTF16.lines(value);
}
Adjusts the indentation of each line of this string based on the value of n
, and normalizes line termination characters. This string is conceptually separated into lines using lines()
. Each line is then adjusted as described below and then suffixed with a line feed "\n"
(U+000A). The resulting lines are then concatenated and returned.
If n > 0
then n
spaces (U+0020) are inserted at the beginning of each line.
If n < 0
then up to n
white space characters are removed from the beginning of each line. If a given line does not contain sufficient white space then all leading white space characters are removed. Each white space character is treated as a single character. In particular, the tab character "\t"
(U+0009) is considered a single character; it is not expanded.
If n == 0
then the line remains unchanged. However, line terminators are still normalized.
Params: - n – number of leading white space characters to add or remove
See Also: Returns: string with indentation adjusted and line endings normalized Since: 12
/**
* Adjusts the indentation of each line of this string based on the value of
* {@code n}, and normalizes line termination characters.
* <p>
* This string is conceptually separated into lines using
* {@link String#lines()}. Each line is then adjusted as described below
* and then suffixed with a line feed {@code "\n"} (U+000A). The resulting
* lines are then concatenated and returned.
* <p>
* If {@code n > 0} then {@code n} spaces (U+0020) are inserted at the
* beginning of each line.
* <p>
* If {@code n < 0} then up to {@code n}
* {@linkplain Character#isWhitespace(int) white space characters} are removed
* from the beginning of each line. If a given line does not contain
* sufficient white space then all leading
* {@linkplain Character#isWhitespace(int) white space characters} are removed.
* Each white space character is treated as a single character. In
* particular, the tab character {@code "\t"} (U+0009) is considered a
* single character; it is not expanded.
* <p>
* If {@code n == 0} then the line remains unchanged. However, line
* terminators are still normalized.
*
* @param n number of leading
* {@linkplain Character#isWhitespace(int) white space characters}
* to add or remove
*
* @return string with indentation adjusted and line endings normalized
*
* @see String#lines()
* @see String#isBlank()
* @see Character#isWhitespace(int)
*
* @since 12
*/
public String indent(int n) {
if (isEmpty()) {
return "";
}
Stream<String> stream = lines();
if (n > 0) {
final String spaces = " ".repeat(n);
stream = stream.map(s -> spaces + s);
} else if (n == Integer.MIN_VALUE) {
stream = stream.map(s -> s.stripLeading());
} else if (n < 0) {
stream = stream.map(s -> s.substring(Math.min(-n, s.indexOfNonWhitespace())));
}
return stream.collect(Collectors.joining("\n", "", "\n"));
}
private int indexOfNonWhitespace() {
return isLatin1() ? StringLatin1.indexOfNonWhitespace(value)
: StringUTF16.indexOfNonWhitespace(value);
}
private int lastIndexOfNonWhitespace() {
return isLatin1() ? StringLatin1.lastIndexOfNonWhitespace(value)
: StringUTF16.lastIndexOfNonWhitespace(value);
}
{@preview Associated with text blocks, a preview feature of
the Java language.
This method is associated with text blocks, a preview feature of the Java language. Programs can only use this method when preview features are enabled. Preview features may be removed in a future release, or upgraded to permanent features of the Java language.} Returns a string whose value is this string, with incidental white space removed from the beginning and end of every line. Incidental white space is often present in a text block to align the content with the opening delimiter. For example, in the following code, dots represent incidental white space:
String html = """
..............<html>
.............. <body>
.............. <p>Hello, world</p>
.............. </body>
..............</html>
..............""";
This method treats the incidental white space as indentation to be stripped, producing a string that preserves the relative indentation of the content. Using | to visualize the start of each line of the string: |<html>
| <body>
| <p>Hello, world</p>
| </body>
|</html>
First, the individual lines of this string are extracted as if by using lines()
.
Then, the minimum indentation (min) is determined as follows. For each non-blank line (as defined by isBlank()
), the leading white space characters are counted. The leading white space characters on the last line are also counted even if blank. The min value is the smallest
of these counts.
For each non-blank line, min leading white space characters are removed, and any trailing white space characters are removed. Blank lines are replaced with the empty string.
Finally, the lines are joined into a new string, using the LF character "\n"
(U+000A) to separate lines.
See Also: API Note: This method's primary purpose is to shift a block of lines as far as possible to the left, while preserving relative indentation. Lines that were indented the least will thus have no leading white space. The line count of the result will be the same as line count of this string. If this string ends with a line terminator then the result will end with a line terminator. Implementation Note: This method treats all white space characters as having equal width. As long as the indentation on every line is consistently composed of the same character sequences, then the result will be as described above. Returns: string with incidental indentation removed and line
terminators normalized Since: 13
/**
* {@preview Associated with text blocks, a preview feature of
* the Java language.
*
* This method is associated with <i>text blocks</i>, a preview
* feature of the Java language. Programs can only use this
* method when preview features are enabled. Preview features
* may be removed in a future release, or upgraded to permanent
* features of the Java language.}
*
* Returns a string whose value is this string, with incidental
* {@linkplain Character#isWhitespace(int) white space} removed from
* the beginning and end of every line.
* <p>
* Incidental {@linkplain Character#isWhitespace(int) white space}
* is often present in a text block to align the content with the opening
* delimiter. For example, in the following code, dots represent incidental
* {@linkplain Character#isWhitespace(int) white space}:
* <blockquote><pre>
* String html = """
* ..............<html>
* .............. <body>
* .............. <p>Hello, world</p>
* .............. </body>
* ..............</html>
* ..............""";
* </pre></blockquote>
* This method treats the incidental
* {@linkplain Character#isWhitespace(int) white space} as indentation to be
* stripped, producing a string that preserves the relative indentation of
* the content. Using | to visualize the start of each line of the string:
* <blockquote><pre>
* |<html>
* | <body>
* | <p>Hello, world</p>
* | </body>
* |</html>
* </pre></blockquote>
* First, the individual lines of this string are extracted as if by using
* {@link String#lines()}.
* <p>
* Then, the <i>minimum indentation</i> (min) is determined as follows.
* For each non-blank line (as defined by {@link String#isBlank()}), the
* leading {@linkplain Character#isWhitespace(int) white space} characters are
* counted. The leading {@linkplain Character#isWhitespace(int) white space}
* characters on the last line are also counted even if
* {@linkplain String#isBlank() blank}. The <i>min</i> value is the smallest
* of these counts.
* <p>
* For each {@linkplain String#isBlank() non-blank} line, <i>min</i> leading
* {@linkplain Character#isWhitespace(int) white space} characters are removed,
* and any trailing {@linkplain Character#isWhitespace(int) white space}
* characters are removed. {@linkplain String#isBlank() Blank} lines are
* replaced with the empty string.
*
* <p>
* Finally, the lines are joined into a new string, using the LF character
* {@code "\n"} (U+000A) to separate lines.
*
* @apiNote
* This method's primary purpose is to shift a block of lines as far as
* possible to the left, while preserving relative indentation. Lines
* that were indented the least will thus have no leading
* {@linkplain Character#isWhitespace(int) white space}.
* The line count of the result will be the same as line count of this
* string.
* If this string ends with a line terminator then the result will end
* with a line terminator.
*
* @implNote
* This method treats all {@linkplain Character#isWhitespace(int) white space}
* characters as having equal width. As long as the indentation on every
* line is consistently composed of the same character sequences, then the
* result will be as described above.
*
* @return string with incidental indentation removed and line
* terminators normalized
*
* @see String#lines()
* @see String#isBlank()
* @see String#indent(int)
* @see Character#isWhitespace(int)
*
* @since 13
*
*/
@jdk.internal.PreviewFeature(feature=jdk.internal.PreviewFeature.Feature.TEXT_BLOCKS,
essentialAPI=true)
public String stripIndent() {
int length = length();
if (length == 0) {
return "";
}
char lastChar = charAt(length - 1);
boolean optOut = lastChar == '\n' || lastChar == '\r';
List<String> lines = lines().collect(Collectors.toList());
final int outdent = optOut ? 0 : outdent(lines);
return lines.stream()
.map(line -> {
int firstNonWhitespace = line.indexOfNonWhitespace();
int lastNonWhitespace = line.lastIndexOfNonWhitespace();
int incidentalWhitespace = Math.min(outdent, firstNonWhitespace);
return firstNonWhitespace > lastNonWhitespace
? "" : line.substring(incidentalWhitespace, lastNonWhitespace);
})
.collect(Collectors.joining("\n", "", optOut ? "\n" : ""));
}
private static int outdent(List<String> lines) {
// Note: outdent is guaranteed to be zero or positive number.
// If there isn't a non-blank line then the last must be blank
int outdent = Integer.MAX_VALUE;
for (String line : lines) {
int leadingWhitespace = line.indexOfNonWhitespace();
if (leadingWhitespace != line.length()) {
outdent = Integer.min(outdent, leadingWhitespace);
}
}
String lastLine = lines.get(lines.size() - 1);
if (lastLine.isBlank()) {
outdent = Integer.min(outdent, lastLine.length());
}
return outdent;
}
{@preview Associated with text blocks, a preview feature of
the Java language.
This method is associated with text blocks, a preview
feature of the Java language. Programs can only use this
method when preview features are enabled. Preview features
may be removed in a future release, or upgraded to permanent
features of the Java language.}
Returns a string whose value is this string, with escape sequences
translated as if in a string literal.
Escape sequences are translated as follows;
Translation
Escape
Name
Translation
\u005Cb
backspace
U+0008
\u005Ct
horizontal tab
U+0009
\u005Cn
line feed
U+000A
\u005Cf
form feed
U+000C
\u005Cr
carriage return
U+000D
\u005Cs
space
U+0020
\u005C"
double quote
U+0022
\u005C'
single quote
U+0027
\u005C\u005C
backslash
U+005C
\u005C0 - \u005C377
octal escape
code point equivalents
\u005C<line-terminator>
continuation
discard
Throws: - IllegalArgumentException – when an escape sequence is malformed.
Implementation Note:
This method does not translate Unicode escapes such as "\u005cu2022
". Unicode escapes are translated by the Java compiler when reading input characters and are not part of the string literal specification. Returns: String with escape sequences translated. @jls 3.10.7 Escape Sequences Since: 13
/**
* {@preview Associated with text blocks, a preview feature of
* the Java language.
*
* This method is associated with <i>text blocks</i>, a preview
* feature of the Java language. Programs can only use this
* method when preview features are enabled. Preview features
* may be removed in a future release, or upgraded to permanent
* features of the Java language.}
*
* Returns a string whose value is this string, with escape sequences
* translated as if in a string literal.
* <p>
* Escape sequences are translated as follows;
* <table class="striped">
* <caption style="display:none">Translation</caption>
* <thead>
* <tr>
* <th scope="col">Escape</th>
* <th scope="col">Name</th>
* <th scope="col">Translation</th>
* </tr>
* </thead>
* <tbody>
* <tr>
* <th scope="row">{@code \u005Cb}</th>
* <td>backspace</td>
* <td>{@code U+0008}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005Ct}</th>
* <td>horizontal tab</td>
* <td>{@code U+0009}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005Cn}</th>
* <td>line feed</td>
* <td>{@code U+000A}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005Cf}</th>
* <td>form feed</td>
* <td>{@code U+000C}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005Cr}</th>
* <td>carriage return</td>
* <td>{@code U+000D}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005Cs}</th>
* <td>space</td>
* <td>{@code U+0020}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005C"}</th>
* <td>double quote</td>
* <td>{@code U+0022}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005C'}</th>
* <td>single quote</td>
* <td>{@code U+0027}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005C\u005C}</th>
* <td>backslash</td>
* <td>{@code U+005C}</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005C0 - \u005C377}</th>
* <td>octal escape</td>
* <td>code point equivalents</td>
* </tr>
* <tr>
* <th scope="row">{@code \u005C<line-terminator>}</th>
* <td>continuation</td>
* <td>discard</td>
* </tr>
* </tbody>
* </table>
*
* @implNote
* This method does <em>not</em> translate Unicode escapes such as "{@code \u005cu2022}".
* Unicode escapes are translated by the Java compiler when reading input characters and
* are not part of the string literal specification.
*
* @throws IllegalArgumentException when an escape sequence is malformed.
*
* @return String with escape sequences translated.
*
* @jls 3.10.7 Escape Sequences
*
* @since 13
*/
@jdk.internal.PreviewFeature(feature=jdk.internal.PreviewFeature.Feature.TEXT_BLOCKS,
essentialAPI=true)
public String translateEscapes() {
if (isEmpty()) {
return "";
}
char[] chars = toCharArray();
int length = chars.length;
int from = 0;
int to = 0;
while (from < length) {
char ch = chars[from++];
if (ch == '\\') {
ch = from < length ? chars[from++] : '\0';
switch (ch) {
case 'b':
ch = '\b';
break;
case 'f':
ch = '\f';
break;
case 'n':
ch = '\n';
break;
case 'r':
ch = '\r';
break;
case 's':
ch = ' ';
break;
case 't':
ch = '\t';
break;
case '\'':
case '\"':
case '\\':
// as is
break;
case '0': case '1': case '2': case '3':
case '4': case '5': case '6': case '7':
int limit = Integer.min(from + (ch <= '3' ? 2 : 1), length);
int code = ch - '0';
while (from < limit) {
ch = chars[from];
if (ch < '0' || '7' < ch) {
break;
}
from++;
code = (code << 3) | (ch - '0');
}
ch = (char)code;
break;
case '\n':
continue;
case '\r':
if (from < length && chars[from] == '\n') {
from++;
}
continue;
default: {
String msg = String.format(
"Invalid escape sequence: \\%c \\\\u%04X",
ch, (int)ch);
throw new IllegalArgumentException(msg);
}
}
}
chars[to++] = ch;
}
return new String(chars, 0, to);
}
This method allows the application of a function to this
string. The function should expect a single String argument and produce an R
result. Any exception thrown by f()
will be propagated to the caller.
Params: - f – functional interface to a apply
Type parameters: - <R> – class of the result
See Also: Returns: the result of applying the function to this string Since: 12
/**
* This method allows the application of a function to {@code this}
* string. The function should expect a single String argument
* and produce an {@code R} result.
* <p>
* Any exception thrown by {@code f()} will be propagated to the
* caller.
*
* @param f functional interface to a apply
*
* @param <R> class of the result
*
* @return the result of applying the function to this string
*
* @see java.util.function.Function
*
* @since 12
*/
public <R> R transform(Function<? super String, ? extends R> f) {
return f.apply(this);
}
This object (which is already a string!) is itself returned.
Returns: the string itself.
/**
* This object (which is already a string!) is itself returned.
*
* @return the string itself.
*/
public String toString() {
return this;
}
Returns a stream of int
zero-extending the char
values from this sequence. Any char which maps to a surrogate code
point is passed through uninterpreted.
Returns: an IntStream of char values from this sequence Since: 9
/**
* Returns a stream of {@code int} zero-extending the {@code char} values
* from this sequence. Any char which maps to a <a
* href="{@docRoot}/java.base/java/lang/Character.html#unicode">surrogate code
* point</a> is passed through uninterpreted.
*
* @return an IntStream of char values from this sequence
* @since 9
*/
@Override
public IntStream chars() {
return StreamSupport.intStream(
isLatin1() ? new StringLatin1.CharsSpliterator(value, Spliterator.IMMUTABLE)
: new StringUTF16.CharsSpliterator(value, Spliterator.IMMUTABLE),
false);
}
Returns a stream of code point values from this sequence. Any surrogate pairs encountered in the sequence are combined as if by Character.toCodePoint and the result is passed to the stream. Any other code units, including ordinary BMP characters, unpaired surrogates, and undefined code units, are zero-extended to int
values which are then passed to the stream. Returns: an IntStream of Unicode code points from this sequence Since: 9
/**
* Returns a stream of code point values from this sequence. Any surrogate
* pairs encountered in the sequence are combined as if by {@linkplain
* Character#toCodePoint Character.toCodePoint} and the result is passed
* to the stream. Any other code units, including ordinary BMP characters,
* unpaired surrogates, and undefined code units, are zero-extended to
* {@code int} values which are then passed to the stream.
*
* @return an IntStream of Unicode code points from this sequence
* @since 9
*/
@Override
public IntStream codePoints() {
return StreamSupport.intStream(
isLatin1() ? new StringLatin1.CharsSpliterator(value, Spliterator.IMMUTABLE)
: new StringUTF16.CodePointsSpliterator(value, Spliterator.IMMUTABLE),
false);
}
Converts this string to a new character array.
Returns: a newly allocated character array whose length is the length
of this string and whose contents are initialized to contain
the character sequence represented by this string.
/**
* Converts this string to a new character array.
*
* @return a newly allocated character array whose length is the length
* of this string and whose contents are initialized to contain
* the character sequence represented by this string.
*/
public char[] toCharArray() {
return isLatin1() ? StringLatin1.toChars(value)
: StringUTF16.toChars(value);
}
Returns a formatted string using the specified format string and
arguments.
The locale always used is the one returned by
Locale.getDefault(Locale.Category)
with FORMAT
category specified.
Params: - format –
A format string
- args –
Arguments referenced by the format specifiers in the format
string. If there are more arguments than format specifiers, the
extra arguments are ignored. The number of arguments is
variable and may be zero. The maximum number of arguments is
limited by the maximum dimension of a Java array as defined by
The Java™ Virtual Machine Specification. The behaviour on a
null
argument depends on the conversion.
Throws: - IllegalFormatException –
If a format string contains an illegal syntax, a format
specifier that is incompatible with the given arguments,
insufficient arguments given the format string, or other
illegal conditions. For specification of all possible
formatting errors, see the Details section of the
formatter class specification.
See Also: Returns: A formatted string Since: 1.5
/**
* Returns a formatted string using the specified format string and
* arguments.
*
* <p> The locale always used is the one returned by {@link
* java.util.Locale#getDefault(java.util.Locale.Category)
* Locale.getDefault(Locale.Category)} with
* {@link java.util.Locale.Category#FORMAT FORMAT} category specified.
*
* @param format
* A <a href="../util/Formatter.html#syntax">format string</a>
*
* @param args
* Arguments referenced by the format specifiers in the format
* string. If there are more arguments than format specifiers, the
* extra arguments are ignored. The number of arguments is
* variable and may be zero. The maximum number of arguments is
* limited by the maximum dimension of a Java array as defined by
* <cite>The Java™ Virtual Machine Specification</cite>.
* The behaviour on a
* {@code null} argument depends on the <a
* href="../util/Formatter.html#syntax">conversion</a>.
*
* @throws java.util.IllegalFormatException
* If a format string contains an illegal syntax, a format
* specifier that is incompatible with the given arguments,
* insufficient arguments given the format string, or other
* illegal conditions. For specification of all possible
* formatting errors, see the <a
* href="../util/Formatter.html#detail">Details</a> section of the
* formatter class specification.
*
* @return A formatted string
*
* @see java.util.Formatter
* @since 1.5
*/
public static String format(String format, Object... args) {
return new Formatter().format(format, args).toString();
}
Returns a formatted string using the specified locale, format string,
and arguments.
Params: - l – The locale to apply during formatting. If
l
is null
then no localization is applied. - format –
A format string
- args –
Arguments referenced by the format specifiers in the format
string. If there are more arguments than format specifiers, the
extra arguments are ignored. The number of arguments is
variable and may be zero. The maximum number of arguments is
limited by the maximum dimension of a Java array as defined by
The Java™ Virtual Machine Specification. The behaviour on a
null
argument depends on the conversion.
Throws: - IllegalFormatException –
If a format string contains an illegal syntax, a format
specifier that is incompatible with the given arguments,
insufficient arguments given the format string, or other
illegal conditions. For specification of all possible
formatting errors, see the Details section of the
formatter class specification
See Also: Returns: A formatted string Since: 1.5
/**
* Returns a formatted string using the specified locale, format string,
* and arguments.
*
* @param l
* The {@linkplain java.util.Locale locale} to apply during
* formatting. If {@code l} is {@code null} then no localization
* is applied.
*
* @param format
* A <a href="../util/Formatter.html#syntax">format string</a>
*
* @param args
* Arguments referenced by the format specifiers in the format
* string. If there are more arguments than format specifiers, the
* extra arguments are ignored. The number of arguments is
* variable and may be zero. The maximum number of arguments is
* limited by the maximum dimension of a Java array as defined by
* <cite>The Java™ Virtual Machine Specification</cite>.
* The behaviour on a
* {@code null} argument depends on the
* <a href="../util/Formatter.html#syntax">conversion</a>.
*
* @throws java.util.IllegalFormatException
* If a format string contains an illegal syntax, a format
* specifier that is incompatible with the given arguments,
* insufficient arguments given the format string, or other
* illegal conditions. For specification of all possible
* formatting errors, see the <a
* href="../util/Formatter.html#detail">Details</a> section of the
* formatter class specification
*
* @return A formatted string
*
* @see java.util.Formatter
* @since 1.5
*/
public static String format(Locale l, String format, Object... args) {
return new Formatter(l).format(format, args).toString();
}
{@preview Associated with text blocks, a preview feature of
the Java language.
This method is associated with text blocks, a preview
feature of the Java language. Programs can only use this
method when preview features are enabled. Preview features
may be removed in a future release, or upgraded to permanent
features of the Java language.}
Formats using this string as the format string, and the supplied
arguments.
Params: - args –
Arguments referenced by the format specifiers in this string.
See Also: Implementation Requirements: This method is equivalent to String.format(this, args)
. Returns: A formatted string Since: 13
/**
* {@preview Associated with text blocks, a preview feature of
* the Java language.
*
* This method is associated with <i>text blocks</i>, a preview
* feature of the Java language. Programs can only use this
* method when preview features are enabled. Preview features
* may be removed in a future release, or upgraded to permanent
* features of the Java language.}
*
* Formats using this string as the format string, and the supplied
* arguments.
*
* @implSpec This method is equivalent to {@code String.format(this, args)}.
*
* @param args
* Arguments referenced by the format specifiers in this string.
*
* @return A formatted string
*
* @see java.lang.String#format(String,Object...)
* @see java.util.Formatter
*
* @since 13
*
*/
@jdk.internal.PreviewFeature(feature=jdk.internal.PreviewFeature.Feature.TEXT_BLOCKS,
essentialAPI=true)
public String formatted(Object... args) {
return new Formatter().format(this, args).toString();
}
Returns the string representation of the Object
argument. Params: - obj – an
Object
.
See Also: Returns: if the argument is null
, then a string equal to "null"
; otherwise, the value of obj.toString()
is returned.
/**
* Returns the string representation of the {@code Object} argument.
*
* @param obj an {@code Object}.
* @return if the argument is {@code null}, then a string equal to
* {@code "null"}; otherwise, the value of
* {@code obj.toString()} is returned.
* @see java.lang.Object#toString()
*/
public static String valueOf(Object obj) {
return (obj == null) ? "null" : obj.toString();
}
Returns the string representation of the char
array argument. The contents of the character array are copied; subsequent modification of the character array does not affect the returned string. Params: - data – the character array.
Returns: a String
that contains the characters of the character array.
/**
* Returns the string representation of the {@code char} array
* argument. The contents of the character array are copied; subsequent
* modification of the character array does not affect the returned
* string.
*
* @param data the character array.
* @return a {@code String} that contains the characters of the
* character array.
*/
public static String valueOf(char data[]) {
return new String(data);
}
Returns the string representation of a specific subarray of the char
array argument. The offset
argument is the index of the first character of the subarray. The count
argument specifies the length of the subarray. The contents of the subarray are copied; subsequent modification of the character array does not affect the returned string.
Params: - data – the character array.
- offset – initial offset of the subarray.
- count – length of the subarray.
Throws: - IndexOutOfBoundsException – if
offset
is negative, or count
is negative, or offset+count
is larger than data.length
.
Returns: a String
that contains the characters of the specified subarray of the character array.
/**
* Returns the string representation of a specific subarray of the
* {@code char} array argument.
* <p>
* The {@code offset} argument is the index of the first
* character of the subarray. The {@code count} argument
* specifies the length of the subarray. The contents of the subarray
* are copied; subsequent modification of the character array does not
* affect the returned string.
*
* @param data the character array.
* @param offset initial offset of the subarray.
* @param count length of the subarray.
* @return a {@code String} that contains the characters of the
* specified subarray of the character array.
* @throws IndexOutOfBoundsException if {@code offset} is
* negative, or {@code count} is negative, or
* {@code offset+count} is larger than
* {@code data.length}.
*/
public static String valueOf(char data[], int offset, int count) {
return new String(data, offset, count);
}
Equivalent to valueOf(char[], int, int)
. Params: - data – the character array.
- offset – initial offset of the subarray.
- count – length of the subarray.
Throws: - IndexOutOfBoundsException – if
offset
is negative, or count
is negative, or offset+count
is larger than data.length
.
Returns: a String
that contains the characters of the specified subarray of the character array.
/**
* Equivalent to {@link #valueOf(char[], int, int)}.
*
* @param data the character array.
* @param offset initial offset of the subarray.
* @param count length of the subarray.
* @return a {@code String} that contains the characters of the
* specified subarray of the character array.
* @throws IndexOutOfBoundsException if {@code offset} is
* negative, or {@code count} is negative, or
* {@code offset+count} is larger than
* {@code data.length}.
*/
public static String copyValueOf(char data[], int offset, int count) {
return new String(data, offset, count);
}
Equivalent to valueOf(char[])
. Params: - data – the character array.
Returns: a String
that contains the characters of the character array.
/**
* Equivalent to {@link #valueOf(char[])}.
*
* @param data the character array.
* @return a {@code String} that contains the characters of the
* character array.
*/
public static String copyValueOf(char data[]) {
return new String(data);
}
Returns the string representation of the boolean
argument. Params: - b – a
boolean
.
Returns: if the argument is true
, a string equal to "true"
is returned; otherwise, a string equal to "false"
is returned.
/**
* Returns the string representation of the {@code boolean} argument.
*
* @param b a {@code boolean}.
* @return if the argument is {@code true}, a string equal to
* {@code "true"} is returned; otherwise, a string equal to
* {@code "false"} is returned.
*/
public static String valueOf(boolean b) {
return b ? "true" : "false";
}
Returns the string representation of the char
argument. Params: - c – a
char
.
Returns: a string of length 1
containing as its single character the argument c
.
/**
* Returns the string representation of the {@code char}
* argument.
*
* @param c a {@code char}.
* @return a string of length {@code 1} containing
* as its single character the argument {@code c}.
*/
public static String valueOf(char c) {
if (COMPACT_STRINGS && StringLatin1.canEncode(c)) {
return new String(StringLatin1.toBytes(c), LATIN1);
}
return new String(StringUTF16.toBytes(c), UTF16);
}
Returns the string representation of the int
argument. The representation is exactly the one returned by the Integer.toString
method of one argument.
Params: - i – an
int
.
See Also: Returns: a string representation of the int
argument.
/**
* Returns the string representation of the {@code int} argument.
* <p>
* The representation is exactly the one returned by the
* {@code Integer.toString} method of one argument.
*
* @param i an {@code int}.
* @return a string representation of the {@code int} argument.
* @see java.lang.Integer#toString(int, int)
*/
public static String valueOf(int i) {
return Integer.toString(i);
}
Returns the string representation of the long
argument. The representation is exactly the one returned by the Long.toString
method of one argument.
Params: - l – a
long
.
See Also: Returns: a string representation of the long
argument.
/**
* Returns the string representation of the {@code long} argument.
* <p>
* The representation is exactly the one returned by the
* {@code Long.toString} method of one argument.
*
* @param l a {@code long}.
* @return a string representation of the {@code long} argument.
* @see java.lang.Long#toString(long)
*/
public static String valueOf(long l) {
return Long.toString(l);
}
Returns the string representation of the float
argument. The representation is exactly the one returned by the Float.toString
method of one argument.
Params: - f – a
float
.
See Also: Returns: a string representation of the float
argument.
/**
* Returns the string representation of the {@code float} argument.
* <p>
* The representation is exactly the one returned by the
* {@code Float.toString} method of one argument.
*
* @param f a {@code float}.
* @return a string representation of the {@code float} argument.
* @see java.lang.Float#toString(float)
*/
public static String valueOf(float f) {
return Float.toString(f);
}
Returns the string representation of the double
argument. The representation is exactly the one returned by the Double.toString
method of one argument.
Params: - d – a
double
.
See Also: Returns: a string representation of the double
argument.
/**
* Returns the string representation of the {@code double} argument.
* <p>
* The representation is exactly the one returned by the
* {@code Double.toString} method of one argument.
*
* @param d a {@code double}.
* @return a string representation of the {@code double} argument.
* @see java.lang.Double#toString(double)
*/
public static String valueOf(double d) {
return Double.toString(d);
}
Returns a canonical representation for the string object.
A pool of strings, initially empty, is maintained privately by the class String
.
When the intern method is invoked, if the pool already contains a string equal to this String
object as determined by the equals(Object)
method, then the string from the pool is returned. Otherwise, this String
object is added to the pool and a reference to this String
object is returned.
It follows that for any two strings s
and t
, s.intern() == t.intern()
is true
if and only if s.equals(t)
is true
.
All literal strings and string-valued constant expressions are
interned. String literals are defined in section 3.10.5 of the
The Java™ Language Specification.
Returns: a string that has the same contents as this string, but is
guaranteed to be from a pool of unique strings. @jls 3.10.5 String Literals
/**
* Returns a canonical representation for the string object.
* <p>
* A pool of strings, initially empty, is maintained privately by the
* class {@code String}.
* <p>
* When the intern method is invoked, if the pool already contains a
* string equal to this {@code String} object as determined by
* the {@link #equals(Object)} method, then the string from the pool is
* returned. Otherwise, this {@code String} object is added to the
* pool and a reference to this {@code String} object is returned.
* <p>
* It follows that for any two strings {@code s} and {@code t},
* {@code s.intern() == t.intern()} is {@code true}
* if and only if {@code s.equals(t)} is {@code true}.
* <p>
* All literal strings and string-valued constant expressions are
* interned. String literals are defined in section 3.10.5 of the
* <cite>The Java™ Language Specification</cite>.
*
* @return a string that has the same contents as this string, but is
* guaranteed to be from a pool of unique strings.
* @jls 3.10.5 String Literals
*/
public native String intern();
Returns a string whose value is the concatenation of this string repeated count
times.
If this string is empty or count is zero then the empty
string is returned.
Params: - count – number of times to repeat
Throws: - IllegalArgumentException – if the
count
is negative.
Returns: A string composed of this string repeated count
times or the empty string if this string is empty or count is zero Since: 11
/**
* Returns a string whose value is the concatenation of this
* string repeated {@code count} times.
* <p>
* If this string is empty or count is zero then the empty
* string is returned.
*
* @param count number of times to repeat
*
* @return A string composed of this string repeated
* {@code count} times or the empty string if this
* string is empty or count is zero
*
* @throws IllegalArgumentException if the {@code count} is
* negative.
*
* @since 11
*/
public String repeat(int count) {
if (count < 0) {
throw new IllegalArgumentException("count is negative: " + count);
}
if (count == 1) {
return this;
}
final int len = value.length;
if (len == 0 || count == 0) {
return "";
}
if (len == 1) {
final byte[] single = new byte[count];
Arrays.fill(single, value[0]);
return new String(single, coder);
}
if (Integer.MAX_VALUE / count < len) {
throw new OutOfMemoryError("Repeating " + len + " bytes String " + count +
" times will produce a String exceeding maximum size.");
}
final int limit = len * count;
final byte[] multiple = new byte[limit];
System.arraycopy(value, 0, multiple, 0, len);
int copied = len;
for (; copied < limit - copied; copied <<= 1) {
System.arraycopy(multiple, 0, multiple, copied, copied);
}
System.arraycopy(multiple, 0, multiple, copied, limit - copied);
return new String(multiple, coder);
}
////////////////////////////////////////////////////////////////
Copy character bytes from this string into dst starting at dstBegin.
This method doesn't perform any range checking.
Invoker guarantees: dst is in UTF16 (inflate itself for asb), if two
coders are different, and dst is big enough (range check)
Params: - dstBegin – the char index, not offset of byte[]
- coder – the coder of dst[]
/**
* Copy character bytes from this string into dst starting at dstBegin.
* This method doesn't perform any range checking.
*
* Invoker guarantees: dst is in UTF16 (inflate itself for asb), if two
* coders are different, and dst is big enough (range check)
*
* @param dstBegin the char index, not offset of byte[]
* @param coder the coder of dst[]
*/
void getBytes(byte dst[], int dstBegin, byte coder) {
if (coder() == coder) {
System.arraycopy(value, 0, dst, dstBegin << coder, value.length);
} else { // this.coder == LATIN && coder == UTF16
StringLatin1.inflate(value, 0, dst, dstBegin, value.length);
}
}
/*
* Package private constructor. Trailing Void argument is there for
* disambiguating it against other (public) constructors.
*
* Stores the char[] value into a byte[] that each byte represents
* the8 low-order bits of the corresponding character, if the char[]
* contains only latin1 character. Or a byte[] that stores all
* characters in their byte sequences defined by the {@code StringUTF16}.
*/
String(char[] value, int off, int len, Void sig) {
if (len == 0) {
this.value = "".value;
this.coder = "".coder;
return;
}
if (COMPACT_STRINGS) {
byte[] val = StringUTF16.compress(value, off, len);
if (val != null) {
this.value = val;
this.coder = LATIN1;
return;
}
}
this.coder = UTF16;
this.value = StringUTF16.toBytes(value, off, len);
}
/*
* Package private constructor. Trailing Void argument is there for
* disambiguating it against other (public) constructors.
*/
String(AbstractStringBuilder asb, Void sig) {
byte[] val = asb.getValue();
int length = asb.length();
if (asb.isLatin1()) {
this.coder = LATIN1;
this.value = Arrays.copyOfRange(val, 0, length);
} else {
if (COMPACT_STRINGS) {
byte[] buf = StringUTF16.compress(val, 0, length);
if (buf != null) {
this.coder = LATIN1;
this.value = buf;
return;
}
}
this.coder = UTF16;
this.value = Arrays.copyOfRange(val, 0, length << 1);
}
}
/*
* Package private constructor which shares value array for speed.
*/
String(byte[] value, byte coder) {
this.value = value;
this.coder = coder;
}
byte coder() {
return COMPACT_STRINGS ? coder : UTF16;
}
byte[] value() {
return value;
}
boolean isLatin1() {
return COMPACT_STRINGS && coder == LATIN1;
}
@Native static final byte LATIN1 = 0;
@Native static final byte UTF16 = 1;
/*
* StringIndexOutOfBoundsException if {@code index} is
* negative or greater than or equal to {@code length}.
*/
static void checkIndex(int index, int length) {
if (index < 0 || index >= length) {
throw new StringIndexOutOfBoundsException("index " + index +
", length " + length);
}
}
/*
* StringIndexOutOfBoundsException if {@code offset}
* is negative or greater than {@code length}.
*/
static void checkOffset(int offset, int length) {
if (offset < 0 || offset > length) {
throw new StringIndexOutOfBoundsException("offset " + offset +
", length " + length);
}
}
/*
* Check {@code offset}, {@code count} against {@code 0} and {@code length}
* bounds.
*
* @throws StringIndexOutOfBoundsException
* If {@code offset} is negative, {@code count} is negative,
* or {@code offset} is greater than {@code length - count}
*/
static void checkBoundsOffCount(int offset, int count, int length) {
if (offset < 0 || count < 0 || offset > length - count) {
throw new StringIndexOutOfBoundsException(
"offset " + offset + ", count " + count + ", length " + length);
}
}
/*
* Check {@code begin}, {@code end} against {@code 0} and {@code length}
* bounds.
*
* @throws StringIndexOutOfBoundsException
* If {@code begin} is negative, {@code begin} is greater than
* {@code end}, or {@code end} is greater than {@code length}.
*/
static void checkBoundsBeginEnd(int begin, int end, int length) {
if (begin < 0 || begin > end || end > length) {
throw new StringIndexOutOfBoundsException(
"begin " + begin + ", end " + end + ", length " + length);
}
}
Returns the string representation of the codePoint
argument. Params: - codePoint – a
codePoint
.
Throws: - IllegalArgumentException – if the specified
codePoint
is not a
valid Unicode code point.
Returns: a string of length 1
or 2
containing as its single character the argument codePoint
.
/**
* Returns the string representation of the {@code codePoint}
* argument.
*
* @param codePoint a {@code codePoint}.
* @return a string of length {@code 1} or {@code 2} containing
* as its single character the argument {@code codePoint}.
* @throws IllegalArgumentException if the specified
* {@code codePoint} is not a {@linkplain Character#isValidCodePoint
* valid Unicode code point}.
*/
static String valueOfCodePoint(int codePoint) {
if (COMPACT_STRINGS && StringLatin1.canEncode(codePoint)) {
return new String(StringLatin1.toBytes((char)codePoint), LATIN1);
} else if (Character.isBmpCodePoint(codePoint)) {
return new String(StringUTF16.toBytes((char)codePoint), UTF16);
} else if (Character.isSupplementaryCodePoint(codePoint)) {
return new String(StringUTF16.toBytesSupplementary(codePoint), UTF16);
}
throw new IllegalArgumentException(
format("Not a valid Unicode code point: 0x%X", codePoint));
}
Returns an Optional
containing the nominal descriptor for this instance, which is the instance itself. Returns: an Optional
describing the String instance Since: 12
/**
* Returns an {@link Optional} containing the nominal descriptor for this
* instance, which is the instance itself.
*
* @return an {@link Optional} describing the {@linkplain String} instance
* @since 12
*/
@Override
public Optional<String> describeConstable() {
return Optional.of(this);
}
Resolves this instance as a ConstantDesc
, the result of which is the instance itself. Params: - lookup – ignored
Returns: the String instance Since: 12
/**
* Resolves this instance as a {@link ConstantDesc}, the result of which is
* the instance itself.
*
* @param lookup ignored
* @return the {@linkplain String} instance
* @since 12
*/
@Override
public String resolveConstantDesc(MethodHandles.Lookup lookup) {
return this;
}
}