The jsChunkEx library is a JavaScript implementation of a feature of xTalk languages known as chunk expressions. Chunk expressions are used to manipulate text strings using natural-language concepts such as characters, words, items, sentences, lines, and paragraphs.
Click each button to see how a body of text is split into chunks below. Hover over each chunk to reveal its descriptor (type and index). Because chunk expressions originate from natural language, chunk indices start at 1 rather than 0.
Everyone is entitled to all the rights and freedoms set forth in this Declaration, without distinction of any kind, such as race, colour, sex, language, religion, political or other opinion, national or social origin, property, birth or other status. Furthermore, no distinction shall be made on the basis of the political, jurisdictional or international status of the country or territory to which a person belongs, whether it be independent, trust, non-self-governing or under any other limitation of sovereignty.
How To Use
To use jsChunkEx, embed the following script tag on your page:
<script type="text/javascript" src="http://www.kreativekorp.com/lib/jsChunkEx/jsChunkEx.js"></script>
Or, if you prefer to host it yourself, download jsChunkEx.zip.
Call jsChunkEx using any of the following APIs:
jsChunkEx.countChunks(text, descriptors, chunkType)
- Count the number of chunks in a string.
jsChunkEx.splitChunks(text, descriptors, chunkType)
- Split a string into an array of its constituent chunks.
jsChunkEx.findChunk(text, descriptors)
- Determine the start and end offset of a chunk in a string.
jsChunkEx.findChunkToDelete(text, descriptors)
- Determine the start offsets of a chunk and the following chunk.
jsChunkEx.getChunk(text, descriptors)
- Return the substring corresponding to a chunk of a string.
jsChunkEx.deleteChunk(text, descriptors)
- Return a string with the specified chunk removed.
jsChunkEx.replaceChunk(text, descriptors, replacement)
- Return a string with a chunk replaced with another string.
jsChunkEx.prependToChunk(text, descriptors, replacement)
- Return a string with another string inserted immediately before a chunk.
jsChunkEx.appendToChunk(text, descriptors, replacement)
- Return a string with another string inserted immediately after a chunk.
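Most of the calls above can be thought of as slices around the offsets of a chunk. The following is a minimal sketch of that idea in plain JavaScript for word chunks only. It is not the jsChunkEx source: the helper names (findWord, getWord, replaceWord, prependToWord, appendToWord) are hypothetical, and the half-open start/end convention is an assumption made for this illustration.

```javascript
// Hypothetical helper, not part of jsChunkEx: locate the 1-based nth word and
// return its offsets as { start, end } (end is exclusive), or null if absent.
function findWord(text, n) {
  const re = /\S+/g;
  let match;
  let count = 0;
  while ((match = re.exec(text)) !== null) {
    count += 1;
    if (count === n) {
      return { start: match.index, end: match.index + match[0].length };
    }
  }
  return null;
}

// Each operation below is a slice around the offsets found above.
function getWord(text, n) {
  const range = findWord(text, n);
  return range ? text.slice(range.start, range.end) : '';
}

function replaceWord(text, n, replacement) {
  const range = findWord(text, n);
  return range
    ? text.slice(0, range.start) + replacement + text.slice(range.end)
    : text;
}

function prependToWord(text, n, addition) {
  const range = findWord(text, n);
  return range
    ? text.slice(0, range.start) + addition + text.slice(range.start)
    : text;
}

function appendToWord(text, n, addition) {
  const range = findWord(text, n);
  return range
    ? text.slice(0, range.end) + addition + text.slice(range.end)
    : text;
}

const greeting = 'Hello, my name is Rebecca.';
console.log(findWord(greeting, 3));              // { start: 10, end: 14 }
console.log(getWord(greeting, 3));               // "name"
console.log(replaceWord(greeting, 5, 'Ginny.')); // "Hello, my name is Ginny."
```

In this sketch an out-of-range index leaves the text unchanged; how jsChunkEx itself treats that case is not covered here.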
Where descriptors is a series of arguments of the following forms:
chunkType, index
- a single chunk at the given index
chunkType, startIndex, endIndex
- a range of chunks between two indices
chunkType, jsChunkEx.BY_CONTENT, stringToMatch
- a chunk matching a given string
chunkType, jsChunkEx.BY_CONTENT, regExpToMatch
- a chunk matching a given regular expression
jsChunkEx.LINE_ENDING, lineEnding
- Change the line ending used by the jsChunkEx.LINE chunk type.
jsChunkEx.ITEM_DELIMITER, itemDelimiter
- Change the delimiter used by the jsChunkEx.ITEM chunk type.
jsChunkEx.COLUMN_DELIMITER, columnDelimiter
- Change the delimiter used by the jsChunkEx.COLUMN chunk type.
jsChunkEx.ROW_DELIMITER, rowDelimiter
- Change the delimiter used by the jsChunkEx.ROW chunk type.
And chunkType is one of the following:
jsChunkEx.CHARACTER
- A Unicode character.
jsChunkEx.WORD
- Sequences of non-whitespace characters separated by whitespace.
jsChunkEx.ITEM
- Sequences of characters delimited by the jsChunkEx.ITEM_DELIMITER.
jsChunkEx.SENTENCE
- Sequences of characters ending with a period, exclamation point, or question mark.
jsChunkEx.LINE
- Sequences of characters delimited by newline, carriage return, CRLF, or the Unicode line separator or paragraph separator character.
jsChunkEx.PARAGRAPH
- Sequences of characters separated by newline, carriage return, or Unicode line separator or paragraph separator characters.
jsChunkEx.COLUMN
- Sequences of characters delimited by the jsChunkEx.COLUMN_DELIMITER.
jsChunkEx.ROW
- Sequences of characters delimited by the jsChunkEx.ROW_DELIMITER.
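To make the difference between a few of these chunk types concrete, here is a rough sketch in plain JavaScript. It does not call jsChunkEx and is not its implementation: the function names are hypothetical, the comma default for items is an assumption borrowed from xTalk convention, and treating a "Unicode character" as one code point is also an assumption. Edge cases such as empty strings are ignored.

```javascript
// Characters: one entry per Unicode code point (assumption for this sketch).
function splitCharacters(text) {
  return Array.from(text);
}

// Words: runs of non-whitespace characters.
function splitWords(text) {
  return text.match(/\S+/g) || [];
}

// Items: pieces between occurrences of a delimiter (comma assumed by default).
// Surrounding whitespace is kept as part of each item.
function splitItems(text, delimiter = ',') {
  return text.split(delimiter);
}

// Lines: pieces between line breaks. CRLF is listed first so that it is
// treated as a single line ending rather than two.
function splitLines(text) {
  return text.split(/\r\n|\n|\r|\u2028|\u2029/);
}

const sample = 'race, colour, sex';
console.log(splitWords(sample));                  // ["race,", "colour,", "sex"]
console.log(splitItems(sample));                  // ["race", " colour", " sex"]
console.log(splitLines('one\r\ntwo\nthree'));     // ["one", "two", "three"]
console.log(splitCharacters('a\u{1F600}').length); // 2
```

Note how the same text yields different pieces per type: the word chunks keep the commas, while the item chunks keep the leading spaces.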
And index is a positive integer for a chunk counted from the beginning of a string, a negative integer for a chunk counted from the end of a string, or one of the following special values:
jsChunkEx.ANY
- A random chunk.
jsChunkEx.FIRST
- The first chunk.
jsChunkEx.MIDDLE
- The middle chunk.
jsChunkEx.LAST
- The last chunk.
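The positive and negative index forms can be illustrated with a small sketch that maps a 1-based chunk index onto a zero-based array position. The resolveIndex function is hypothetical and not part of jsChunkEx. It assumes that -1 refers to the last chunk, and it does not cover the special values listed above.

```javascript
// Hypothetical helper: convert a 1-based chunk index (positive from the start,
// negative from the end) into a zero-based array position.
// Returns -1 when the index is zero, non-integer, or out of range.
function resolveIndex(index, count) {
  if (!Number.isInteger(index) || index === 0) return -1;
  const position = index > 0 ? index - 1 : count + index;
  return position >= 0 && position < count ? position : -1;
}

const words = 'Hello, my name is Rebecca.'.match(/\S+/g);
console.log(words[resolveIndex(1, words.length)]);  // "Hello,"
console.log(words[resolveIndex(-1, words.length)]); // "Rebecca."
console.log(words[resolveIndex(-2, words.length)]); // "is"
console.log(resolveIndex(6, words.length));         // -1
```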
Examples
jsChunkEx.countChunks('Hello, my name is Rebecca.', jsChunkEx.WORD)
- returns 5
jsChunkEx.getChunk('Hello, my name is Rebecca.', jsChunkEx.WORD, 3)
- returns "name"
jsChunkEx.countChunks('Hello, my name is Rebecca.', jsChunkEx.WORD, 3, jsChunkEx.CHARACTER)
- returns 4
jsChunkEx.deleteChunk('Hello, my name is Rebecca.', jsChunkEx.WORD, 2, 4)
- returns "Hello, Rebecca."
jsChunkEx.replaceChunk('Hello, my name is Rebecca.', jsChunkEx.WORD, jsChunkEx.LAST, 'Ginny.')
- returns "Hello, my name is Ginny."
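Two of the results above are worth a closer look: deleting words 2 through 4 also removes the space after word 4, and counting characters within word 3 narrows the text to that word first. The sketch below reproduces both results in plain JavaScript without calling jsChunkEx. The helper names are hypothetical, and cutting up to the start of the following chunk is an assumption based on the description of findChunkToDelete, not a copy of the library's code.

```javascript
// Hypothetical helper: start offset of every word, plus the text length as a
// sentinel so that "the start of the following chunk" always exists.
function wordStarts(text) {
  const starts = [];
  const re = /\S+/g;
  let match;
  while ((match = re.exec(text)) !== null) starts.push(match.index);
  starts.push(text.length);
  return starts;
}

// Delete words first..last (1-based, inclusive) together with the whitespace
// that follows them, by cutting from the start of the first word to the start
// of the word after the last. Invalid ranges leave the text unchanged.
function deleteWords(text, first, last) {
  const starts = wordStarts(text);
  const wordCount = starts.length - 1;
  if (first < 1 || last < first || last > wordCount) return text;
  return text.slice(0, starts[first - 1]) + text.slice(starts[last]);
}

// Narrow to the nth word, then count its characters (by code point).
function countCharactersOfWord(text, n) {
  const words = text.match(/\S+/g) || [];
  return n >= 1 && n <= words.length ? Array.from(words[n - 1]).length : 0;
}

const sentence = 'Hello, my name is Rebecca.';
console.log(deleteWords(sentence, 2, 4));        // "Hello, Rebecca."
console.log(countCharactersOfWord(sentence, 3)); // 4
```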
By all means feel free to play around with jsChunkEx on JSFiddle.