
Working with strings
In this section, we will look at the various aspects of working with strings or text-based data.
String interpolation
In this section, we will demonstrate the CoffeeScript feature of string interpolation.
In JavaScript, creating strings that include variable values involves concatenating the various pieces together. Consider the following example:
var lineCount = countLinesInFile('application.log');
var message = "The file has a total of " + lineCount + " lines";
console.log(message);
This can get messy, and CoffeeScript provides an elegant alternative called string interpolation.
CoffeeScript performs string interpolation in double-quoted strings containing one or more #{} delimiters. Single-quoted strings are not interpolated.
The preceding example can be written as follows:
lineCount = countLinesInFile 'application.log'
message = "The file has a total of #{lineCount} lines"
console.log message
This not only requires less typing, but it can also be easier to read.
String interpolation evaluates the expression inside each delimiter and replaces the placeholder with the expression's result.
Consider the following simple expression:
console.log "Simple expressions are evaluated: 5 x 6 = #{ 5 * 6 }"
The output of the preceding expression will be as follows:
Simple expressions are evaluated: 5 x 6 = 30
String interpolation can also evaluate complex expressions as follows:
num = 23
console.log "num is #{ if num % 2 is 0 then 'even' else 'odd' }."
The output of the preceding expression will be as follows:
num is odd.
String interpolation works by evaluating the expression inside the #{} delimiter and having JavaScript coerce the value into a string. We can control this on our own objects by creating a toString() function that will be used by the coercion mechanism. By default, coercion for an Object will display [object Object].
In the following example, we create an Employee class with a toString() function to override the default coercion value:
class Employee
  constructor: (@firstName, @lastName, @empNum) ->

  toString: ->
    return "#{@firstName} #{@lastName} (No: #{@empNum})"
We can now use an Employee instance with string interpolation and receive a more valuable result:
employee = new Employee('Tracy', 'Ouellette', 876)
console.log "Employee Info: #{employee}"
Its output will be:
Employee Info: Tracy Ouellette (No: 876)
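The coercion behavior described above belongs to JavaScript itself, so it can be sketched in plain JavaScript (the language CoffeeScript compiles to). The following is a minimal illustration, not the code CoffeeScript generates: it uses template literals in place of #{} and a hypothetical plain object to show the default [object Object] result next to the toString() override.

```javascript
// A class with a custom toString(), mirroring the Employee example.
class Employee {
  constructor(firstName, lastName, empNum) {
    this.firstName = firstName;
    this.lastName = lastName;
    this.empNum = empNum;
  }

  // Used by JavaScript whenever the object is coerced to a string.
  toString() {
    return `${this.firstName} ${this.lastName} (No: ${this.empNum})`;
  }
}

// A plain object (hypothetical, for comparison) has no custom toString().
const plain = { firstName: 'Tracy' };
const employee = new Employee('Tracy', 'Ouellette', 876);

console.log(`Plain object: ${plain}`);     // Plain object: [object Object]
console.log(`Employee Info: ${employee}`); // Employee Info: Tracy Ouellette (No: 876)
```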
Wrapping text
When working with text, you may need to wrap a long piece of text over a number of lines in order to not exceed the maximum width.
In this section, we will see how to accomplish this using a regular expression.
In the following steps, we create a wrapText() function that uses a regular expression to split a piece of text at a specified maximum length:
- Define the function as follows:
wrapText = (text, maxLineWidth = 80, lineEnding = '\n') ->
- Create a regular expression instance:
regex = RegExp ".{1,#{maxLineWidth}}(\\s|$)|\\S+?(\\s|$)", 'g'
- Extract matching segments in text, join them with lineEnding, and return the result:
text.match(regex).join lineEnding
The wrapText() function takes a text parameter that represents the text data to be processed and a second optional maxLineWidth parameter representing the desired maximum width. The maximum width parameter will default to 80 characters if no value is passed. There is another optional parameter allowing you to specify the line ending, which defaults to a new line character.
We create a regular expression instance using the RegExp() constructor function, passing a string interpolated value representing our expression and a modifier.
If we break the regular expression down into its basic blocks, the first alternative, .{1,#{maxLineWidth}}(\s|$), requests segments containing 1 to maxLineWidth characters, each followed by a whitespace character or the end of the text. The second alternative, \S+?(\s|$), handles the scenario where there is no whitespace character within the first maxLineWidth characters, such as a single very long word. It matches up to the next available whitespace character, so that segment may exceed maxLineWidth. The backslashes are doubled in our code because the pattern is written as a string.
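To see both alternatives at work, here is a small sketch in plain JavaScript (the language CoffeeScript compiles to). The width of 10 and the sample sentence are assumptions chosen only so that one word is longer than the limit.

```javascript
// Same pattern as the recipe, built with a deliberately small width.
const maxLineWidth = 10; // hypothetical value for illustration
const regex = new RegExp(`.{1,${maxLineWidth}}(\\s|$)|\\S+?(\\s|$)`, 'g');

// 'supercalifragilistic' is 20 characters, so the first alternative
// cannot match it and the second alternative takes over.
const sample = 'short words then supercalifragilistic end';
const segments = sample.match(regex);

console.log(segments);
// [ 'short ', 'words then ', 'supercalifragilistic ', 'end' ]
```

Each segment keeps the whitespace character that ended it, which is why wrapped lines can carry a trailing space.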
We use the String.match() function, which takes a regular expression and returns the segment or segments that match the expression. By default, only the first match is returned, which is not what we want in this case. We use the g (global) modifier when we create our RegExp instance, which will return all matching segments as an array.
Our function ends by calling the Array.join() function, which will join all of the array elements and separate each one with lineEnding.
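Put together, the steps can be sketched in plain JavaScript (the language CoffeeScript compiles to) as follows. This is a hedged translation rather than the compiler's output; the fallback to an empty array is an added assumption, because match() returns null when nothing matches, for example on an empty string.

```javascript
function wrapText(text, maxLineWidth = 80, lineEnding = '\n') {
  // Up to maxLineWidth characters ending at whitespace or end of text,
  // or else one over-long word up to the next whitespace.
  const regex = new RegExp(`.{1,${maxLineWidth}}(\\s|$)|\\S+?(\\s|$)`, 'g');

  // Added guard (not in the recipe): match() returns null on no match.
  const segments = text.match(regex) || [];

  return segments.join(lineEnding);
}

console.log(wrapText('the quick brown fox jumps over the lazy dog', 15));
```

With a width of 15, the sample sentence is split after "brown" and after "over", giving three lines.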
To demonstrate the function in action, we call wrapText() with some sample text from Homer's Odyssey:
homersOdyssey = "He counted his goodly coppers and cauldrons, his
  gold and all his clothes, but there was nothing missing; still he
  kept grieving about not being in his own country, and wandered up
  and down by the shore of the sounding sea bewailing his hard fate.
  Then Minerva came up to him disguised as a young shepherd of
  delicate and princely mien, with a good cloak folded double about
  her shoulders; she had sandals on her comely feet and held a
  javelin in her hand. Ulysses was glad when he saw her, and went
  straight up to her."

console.log wrapText(homersOdyssey, 40, '<br />\n')
Tip
Notice that we used CoffeeScript's ability to declare a string that spans multiple lines. In a regular double-quoted string, lines are joined by a single space. If we wish to preserve formatting, including line breaks and indentation, we can use triple double quotes """. Consider the following example:
title = """
  <title>
    CoffeeScript Strings
  </title>
  """
This code will produce a string such as <title>\n  CoffeeScript Strings\n</title>.
For the wrapText() example, the output is as follows:
He counted his goodly coppers and <br />
cauldrons, his gold and all his clothes, <br />
but there was nothing missing; still he <br />
kept grieving about not being in his own <br />
country, and wandered up and down by the <br />
shore of the sounding sea bewailing his <br />
hard fate. Then Minerva came up to him <br />
disguised as a young shepherd of <br />
delicate and princely mien, with a good <br />
cloak folded double about her shoulders; <br />
she had sandals on her comely feet and <br />
held a javelin in her hand. Ulysses was <br />
glad when he saw her, and went straight <br />
up to her.
Truncating text
In this section, we will see how we can truncate text into the desired size without truncating the middle of words.
Truncating text can be handled in much the same way as we handled word wrapping:
- Define your function:
truncateText = (text, maxLineWidth = 80, ellipsis = '...') ->
- Reduce the maximum line width by the length of the ellipsis:
maxLineWidth -= ellipsis.length
- Create your regular expression:
regex = RegExp ".{1,#{maxLineWidth}}(\\s|$)|\\S+?(\\s|$)"
- Return the first element of the match() result, trimmed and followed by the desired ellipsis:
"#{text.match(regex)[0].trim()}#{ellipsis}"
Our truncateText() function takes a text parameter representing the text data to be truncated and two optional parameters: maxLineWidth, representing the maximum width of the text desired, and ellipsis, representing a string to end our resultant line.
We use the same regular expression as we did in the previous Wrapping text recipe. In this case, however, we reduce the maximum line length by the length of the ellipsis. This will ensure that our result will not exceed the maximum line length.
Because we omit the g (global) modifier this time, match() returns only the first match.
Consider this example:
homersOdyssey = 'He counted his goodly coppers and cauldrons, his gold and all his clothes, but there was nothing missing;'
console.log truncateText homersOdyssey, 30
The output for this code will be:
He counted his goodly...
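For comparison, the same recipe can be sketched in plain JavaScript (the language CoffeeScript compiles to). This is an illustrative translation under stated assumptions: the local name width and the null check for empty input are additions that the recipe itself does not include.

```javascript
function truncateText(text, maxLineWidth = 80, ellipsis = '...') {
  // Reserve room for the ellipsis so the result fits in maxLineWidth.
  const width = maxLineWidth - ellipsis.length;

  // No g flag: match() returns only the first match (or null).
  const regex = new RegExp(`.{1,${width}}(\\s|$)|\\S+?(\\s|$)`);
  const match = text.match(regex);

  // Added guard (not in the recipe) for input that produces no match.
  return match ? `${match[0].trim()}${ellipsis}` : '';
}

const sample = 'He counted his goodly coppers and cauldrons, ' +
  'his gold and all his clothes, but there was nothing missing;';

console.log(truncateText(sample, 30)); // He counted his goodly...
```

Like the recipe, this sketch appends the ellipsis even when the text already fits within the limit, so callers may want to check the length first.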
Converting character casing
In this recipe, we will demonstrate how to convert text from one casing scheme to another:
- Sentence case, for example, This is an example of sentence case
- Title case, for example, This Is an Example of Title Case
- Pascal case, for example, PascalCase
- Camel case, for example, camelCase
- Snake case, for example, snake_case
We will define our case conversion methods as a utility module that we can use for any application:
- Create a constant array with the list of those words that are not capitalized within titles:
WORD_EXCEPTIONS_FOR_TITLECASE = ['a','an','and','but','for','nor','or','the']
- Create helper functions that split text into words on spaces, underscores, or capitalization, and another that capitalizes the first letter of a word:
# Uppercase the first letter and lowercase the rest
capitalizeWord = (word) ->
  word[0].toUpperCase() + word[1..].toLowerCase()

# Split a Pascal or camel case token at each uppercase letter
upperSplit = (item) ->
  words = []
  word = ''
  for char in item.split ''
    if /[A-Z]/.test char
      words.push word if word.length
      word = char
    else
      word += char
  words.push word if word.length
  return words

# Break text into a flat list of lowercase words
splitStringIntoTokens = (text) ->
  results = []
  for token in text.split /[ _]+/
    token = token.trim()
    words = upperSplit token
    for word in words
      results.push word.toLowerCase()
  results
- Create a function to return a string in title case:
toTitleCase = (text, wordsToIgnore = WORD_EXCEPTIONS_FOR_TITLECASE) ->
  words = splitStringIntoTokens text
  words[0] = capitalizeWord words[0]
  for word, index in words[1..]
    unless word in wordsToIgnore
      words[index+1] = capitalizeWord word
  words.join ' '
- Create a function to return a string in sentence case:
toSentenceCase = (text) ->
  words = splitStringIntoTokens text
  words[0] = capitalizeWord words[0]
  words.join ' '
- Create a function to return a string in snake case:
toSnakeCase = (text) ->
  splitStringIntoTokens(text).join '_'
- Create a function to return a string in Pascal case:
toPascalCase = (text) ->
  (capitalizeWord word for word in splitStringIntoTokens(text)).join ''
- Create a function to return a string in camel case:
toCamelCase = (text) ->
  text = toPascalCase text
  text[0].toLowerCase() + text[1..]
- Assign your functions to the module.exports object so they are made available to your applications:
module.exports =
  toSentenceCase: toSentenceCase
  toTitleCase: toTitleCase
  toPascalCase: toPascalCase
  toCamelCase: toCamelCase
  toSnakeCase: toSnakeCase
The module starts with a capitalizeWord() function that takes a single word as a parameter and returns the word capitalized. For example, capitalizeWord 'hello' returns Hello.
The splitStringIntoTokens() function is the workhorse of our module and is responsible for breaking up a string of text into various words. For sentences, this is easily accomplished by splitting the string by spaces. We also want to be able to parse text that contains Pascal and camel case words. This will allow us to convert from Pascal case to snake case, camel case, and so on. We accomplish this by passing each token (word) to the inner upperSplit() function, which reviews the letters of each word, looking for an uppercase value representing the start of a new word.
Calling splitStringIntoTokens 'Hello world' will return an array containing two words, ['hello', 'world'], as will splitStringIntoTokens 'HelloWorld'. Notice that the words are all lowercase. This helps to normalize the tokens for later processing.
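The tokenizing logic is easier to follow with a few concrete inputs. The sketch below restates upperSplit() and splitStringIntoTokens() in plain JavaScript (the language CoffeeScript compiles to) purely as an illustration. The inputs other than 'Hello world' and 'HelloWorld' are hypothetical.

```javascript
// Split a Pascal or camel case token at each uppercase letter.
function upperSplit(item) {
  const words = [];
  let word = '';
  for (const char of item) {
    if (/[A-Z]/.test(char)) {
      if (word.length) words.push(word); // close the previous word
      word = char;                       // start a new word
    } else {
      word += char;
    }
  }
  if (word.length) words.push(word);
  return words;
}

// Break text on spaces and underscores, then on capitalization,
// and normalize every word to lowercase.
function splitStringIntoTokens(text) {
  const results = [];
  for (const token of text.split(/[ _]+/)) {
    for (const word of upperSplit(token.trim())) {
      results.push(word.toLowerCase());
    }
  }
  return results;
}

console.log(splitStringIntoTokens('Hello world'));     // [ 'hello', 'world' ]
console.log(splitStringIntoTokens('HelloWorld'));      // [ 'hello', 'world' ]
console.log(splitStringIntoTokens('snake_case_text')); // [ 'snake', 'case', 'text' ]
console.log(splitStringIntoTokens('HTMLParser'));      // [ 'h', 't', 'm', 'l', 'parser' ]
```

The last call shows a limitation of splitting on every uppercase letter: runs of capitals such as acronyms are broken into single-letter words.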
The remaining functions take the individual words that have been split from the text provided and return the text in the various casing formats. Each takes a single parameter representing the text to be parsed. The toTitleCase() function also takes an optional array of words to ignore when performing title case conversion. If no array is provided, the default WORD_EXCEPTIONS_FOR_TITLECASE array is used.
We finish by exporting toTitleCase(), toSentenceCase(), toPascalCase(), toCamelCase(), and toSnakeCase() as the public API for our casing utility module.
The following code is a small application to demonstrate our casing module:
caseUtils = require './casing_utils'

console.log 'Title:', caseUtils.toTitleCase 'an author and his book'
console.log 'Sentence:', caseUtils.toSentenceCase 'this should be in sentence case'
console.log 'Pascal:', caseUtils.toPascalCase 'this should be in pascal case'
console.log 'Camel:', caseUtils.toCamelCase 'this should be in camel case'
console.log 'Snake:', caseUtils.toSnakeCase 'this should be in snake case'
The output for this code is as follows:
Title: An Author and His Book
Sentence: This should be in sentence case
Pascal: ThisShouldBeInPascalCase
Camel: thisShouldBeInCamelCase
Snake: this_should_be_in_snake_case
Using regular expressions
Regular expressions are a powerful tool for processing text data. We can pass them to the various string methods that accept a regular expression as a parameter, or execute a regular expression directly through its own methods.
We have already seen regular expressions used to split strings and test a value. They can also be passed as parameters to the split() and replace() methods, where the regular expression is used as a matcher.
Let's look at how we can utilize regular expressions using split(), replace(), and test():
# SPLIT() USING A REGULAR EXPRESSION
whiteSpaceRegex = /[\s]/
words = "A happy\tday\nis here"
console.log "Value:", words
console.log (words.split whiteSpaceRegex)

# REPLACE() USING A REGULAR EXPRESSION
phrase = 'The blue balloon is bright'
console.log "Red balloon:", (phrase.replace /blue/, 'red')

# TEST() USING A REGULAR EXPRESSION
validIpAddress = '192.168.10.24'
invalidIpAddress = '192.168-10.24'
testRegex = /\d+\.\d+\.\d+\.\d+/
console.log "#{validIpAddress} valid?", (testRegex.test validIpAddress)
console.log "#{invalidIpAddress} valid?", (testRegex.test invalidIpAddress)
The split() example uses a regular expression to split a string on whitespace characters (\s), including spaces, tabs, newlines, and others. Note that the regular expression is enclosed in two forward slashes (/).
The output for the preceding example is:
Value: A happy  day
is here
[ 'A', 'happy', 'day', 'is', 'here' ]
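In the replace() example, we replace the first instance of blue with red. This updates our phrase to The red balloon is bright. To replace every instance rather than only the first, add the g (global) modifier to the regular expression.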
By default, regular expressions are case sensitive. You can make the matching pattern case insensitive by adding the i modifier. For example, "It's a Wonderful Life".replace /life/i, "Book" will return It's a Wonderful Book.
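The effect of the modifiers on replace() can be checked with a short sketch in plain JavaScript (the language CoffeeScript compiles to). The longer phrase here is a hypothetical input chosen so that the word appears more than once and in different casing.

```javascript
const phrase = 'The blue balloon is next to the Blue kite and a blue ball';

// No modifier: only the first case-sensitive match is replaced.
const first = phrase.replace(/blue/, 'red');

// g modifier: every case-sensitive match is replaced.
const all = phrase.replace(/blue/g, 'red');

// g and i modifiers together: every match, regardless of case.
const allAnyCase = phrase.replace(/blue/gi, 'red');

console.log(first);      // The red balloon is next to the Blue kite and a blue ball
console.log(all);        // The red balloon is next to the Blue kite and a red ball
console.log(allAnyCase); // The red balloon is next to the red kite and a red ball
```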
You can use the RegExp test()
method to see whether a string matches the regular expression pattern. In our example, we have two IP addresses, one that is valid and one that is not. We have a pattern that represents a sequence of four numbers separated by periods. Our invalid IP address uses a hyphen.
192.168.10.24 valid? true
192.168-10.24 valid? false
Tip
Note that our test only verifies that the IP address contains four non-negative integers separated by periods. To validate that each segment is between 0 and 255, we can use the following regular expression:
/(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/
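One caveat is worth sketching: without anchors, test() succeeds if the pattern matches anywhere inside the string. The plain JavaScript example below (the language CoffeeScript compiles to) builds the pattern from a reusable segment and adds ^ and $ anchors. The anchors and the names octet, unanchored, and ipRegex are assumptions added for illustration and are not part of the expression shown above.

```javascript
// One segment: 250-255, 200-249, or 0-199 (leading zeros allowed).
const octet = '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)';

// Unanchored, as in the tip: may match a substring of the input.
const unanchored = new RegExp(`${octet}\\.${octet}\\.${octet}\\.${octet}`);

// Anchored (an added assumption): the whole string must be an address.
const ipRegex = new RegExp(`^${octet}\\.${octet}\\.${octet}\\.${octet}$`);

console.log(ipRegex.test('192.168.10.24'));  // true
console.log(ipRegex.test('192.168-10.24'));  // false
console.log(unanchored.test('256.1.1.1'));   // true, matches '56.1.1.1' inside
console.log(ipRegex.test('256.1.1.1'));      // false
```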
There are many great online resources to learn more about regular expressions including the following:
- A full overview of regular expressions from the Mozilla Developer Network at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
- An interactive regular expression tester at http://regex101.com
- A regular expression visualization tool at http://www.regexper.com