With Crystal 0.28.0 we have a new feature for formatting numbers for human readers.
Previously the options were using #to_s on various Number types or, at best, sprintf. Both provide only limited output formats and they’re focused on how numbers are represented for computers. They don’t have readability for humans in mind.
When showing numbers in a user interface, they need to be understandable by human readers.
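For comparison, here is a minimal sketch of what the older options produce. The sample value and the variable names are only illustrative:

```crystal
value = 123_456.789

# #to_s prints every digit without any grouping.
plain = value.to_s # => "123456.789"

# sprintf can fix the number of decimal places, but it does not group digits.
fixed = sprintf("%.2f", value) # => "123456.79"

puts plain
puts fixed
```

Neither result has a thousands delimiter, which makes larger values hard to read at a glance.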
Format a Number
Meet the new Number#format method.
It prints numbers in a customizable format that can represent the way numbers are usually written for humans.
Number styles
Numbers can be formatted using a configurable decimal separator and thousands delimiter:
123_456.789.format('.', ',') # => "123,456.789"
123_456.789.format(',', '.') # => "123.456,789"
123_456.789.format(',', ' ') # => "123 456,789"
123_456.789.format(',', '\'') # => "123'456,789"
The number of digits in a group is also configurable. This works, for example, for Chinese numbers, which are grouped by ten thousands:
123_456.789.format('.', ',', group: 4) # => "12,3456.789"
There are many different styles used in different cultural contexts, and this method is flexible enough to represent most common formats.
How the world separates its digits provides an overview of international styles, and the Wikipedia article on Decimal Separators provides some more insight on this topic.
Decimal places
Floating point numbers can produce a lot of decimal places when converted to a human-readable string. For user output such detail is usually a distraction and displaying a few decimal places is plenty.
The number of decimal places can be configured directly in the #format method:
123_456.789.format(decimal_places: 2) # => "123,456.79"
123_456.789.format(decimal_places: 0) # => "123,457"
123_456.789.format(decimal_places: 4) # => "123,456.7890"
Compared to rounding the value manually before formatting it, this is easier and allows for more options.
The number of decimal places is fixed by default. Trailing zeros will only be omitted when only_significant is true:
123_456.789.format(decimal_places: 6) # => "123,456.789000"
123_456.789.format(decimal_places: 6, only_significant: true) # => "123,456.789"
Humanize a Number
When numbers of different orders of magnitude are put in relation, it’s difficult to represent a large range of values in a meaningful way.
In such cases, it’s common to express the magnitude of a value using a quantifier.
For this we have Number#humanize: it rounds the number to the nearest thousands magnitude with a specific number of significant digits.
1_200_000_000.humanize # => "1.2G"
0.000_000_012.humanize # => "12.0n"
It has the same arguments for decimal separator and thousands delimiter as Number#format, so the style is configurable in exactly the same way.
The number of significant digits can be adjusted with precision, but the default value of 3 is probably already a good fit for most applications.
When significant is false, the value of precision is instead the fixed number of decimal digits, regardless of the number’s value.
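As a short illustration of the two arguments, the following values mirror the examples given in the API documentation for Number#humanize. The variable names are only there for the sketch:

```crystal
# precision counts significant digits by default (significant: true).
rounded = 1_234.567_890.humanize(precision: 2) # => "1.2k"

# With significant: false, precision is the number of digits after the decimal separator.
fixed = 1_234.567_890.humanize(precision: 2, significant: false) # => "1.23k"

puts rounded
puts fixed
```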
Quantifiers are by default the SI prefixes (k, M, G, etc.), but they’re completely configurable, either by providing a list or a proc.
Customizable quantifiers
Number#humanize can take a proc argument that calculates the number of digits and the quantifier for a specific magnitude.
The following example shows how to format a length in metric units, including the unit designator. It differs from the default implementation by using the common centimeter unit for values between 0.01 and 0.99 (which the generic mapping would express as millimeter). All other values use the generic SI prefixes (provided by Number.si_prefix).
def humanize_length(number)
  number.humanize do |magnitude, number|
    case magnitude
    when -2, -1 then {-2, " cm"}
    else
      magnitude = Number.prefix_index(magnitude)
      {magnitude, " #{Number.si_prefix(magnitude)}m"}
    end
  end
end
humanize_length(1_420) # => "1.42 km"
humanize_length(0.23) # => "23.0 cm"
humanize_length(0.05) # => "5.0 cm"
humanize_length(0.001) # => "1.0 mm"
Humanize Bytes
The third method is Int#humanize_bytes, which formats a number of bytes (for example a memory size) in a typical format. It supports both IEC (Ki, Mi, Gi, Ti, Pi, Ei, Zi, Yi) and JEDEC (K, M, G, T, P, E, Z, Y) prefixes.
1.humanize_bytes # => "1B"
1024.humanize_bytes # => "1.0kiB"
1536.humanize_bytes # => "1.5kiB"
524288.humanize_bytes(format: :JEDEC) # => "512kB"
1073741824.humanize_bytes(format: :JEDEC) # => "1.0GB"
The implementation of this method is another example of a custom format based on Number#humanize.
Summary
These new methods provide great features for making numbers look pretty to the reader.
They do not provide style mappings for specific locales. This is a non-trivial task that should be left to dedicated I18N libraries. But they’re useful building blocks that such libraries can build upon, and they’re immediately usable when you don’t need to support different locales.
The implementation is not perfect, though. Localization is complex and hard to get right. As always, the devil lies in the details. For example, the thousands delimiter and group size are configurable, but each has a single fixed value for the whole number. The Indian numbering system, which mixes group sizes, can’t be represented in this way. Also, only Arabic numerals are supported. And there are probably lots of other cases which would require more specialized behaviour.
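To illustrate the Indian numbering case, here is a minimal sketch of a hand-written helper. The method format_indian is hypothetical and not part of Crystal’s standard library. It only handles integers and assumes the usual convention of one group of three digits followed by groups of two:

```crystal
# Hypothetical helper (not in the standard library): groups an integer in the
# Indian style, i.e. the last three digits together, then groups of two.
def format_indian(number : Int) : String
  digits = number.abs.to_s
  return number.to_s if digits.size <= 3

  head = digits[0...-3] # everything before the last three digits
  tail = digits[-3, 3]  # the last three digits stay together

  # Split the head into groups of two, counted from the right.
  groups = [] of String
  while head.size > 2
    groups.unshift(head[-2, 2])
    head = head[0...-2]
  end
  groups.unshift(head)

  sign = number < 0 ? "-" : ""
  "#{sign}#{groups.join(",")},#{tail}"
end

puts format_indian(12_345_678) # => "1,23,45,678"
puts format_indian(999)        # => "999"
```

A single group size, as accepted by Number#format, cannot express this mixed pattern, which is why a separate helper or an I18N library is needed for it.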
But it’s probably good for more than 90% of typical use cases, and already useful in many places. And there is always room for improvement.
More background information can be found in the PR which brought these features.
Also a good read on formatting numbers from a more general perspective: Formatting numbers for machines and mortals by Hjalmar Gislason.