7/21/2019 Clojure, Numbers, Despair
http://slidepdf.com/reader/full/clojure-numbers-despair 1/8
31/10/2015 Clojure, numbers, despair
http://blog.mishkovskyi.net/posts/2015/Oct/29/clojure-numbers-despair 2/8
clojure.lang.Ratio
java.lang.Number
java.lang.Integer
java.lang.Long
java.math.BigInteger
java.math.BigDecimal
java.lang.Float
java.lang.Double
Not surprisingly, most of these are just Java types. However, two more types are
added: BigInt
(https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/BigInt.java)
and Ratio
(https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Ratio.java).
Both are weird. I'd like to focus a bit on Ratio. A Ratio can be created by integer division, but only when the division cannot produce an integer:
;; Aight
user> (type (/ 1 2))
clojure.lang.Ratio
;; Not really expecting this
user> (type (/ 1 1))
java.lang.Long
;; Yeah, well, WAIT WHAT
user> (type (/ 1N 1M))
java.math.BigDecimal
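A few more REPL-style checks make this rule concrete. This is a minimal sketch using only clojure.core functions (ratio?, integer?, numerator, denominator); the values in the comments assume a stock Clojure 1.3 or later.

```clojure
;; Integer division yields a Ratio only when the quotient is not a whole number.
(ratio? (/ 1 2))       ; => true
(ratio? (/ 4 2))       ; => false, the quotient collapses to a Long
(integer? (/ 4 2))     ; => true

;; Ratios produced by / are always reduced to lowest terms.
(/ 2 4)                ; => 1/2
(numerator (/ 2 4))    ; => 1
(denominator (/ 2 4))  ; => 2
```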
We can also write a ratio literal or call the Ratio constructor directly (and fail miserably in some cases):
;; Cool
user> (type 1/2)
clojure.lang.Ratio
;; Eh?
user> (clojure.lang.Ratio. 1 1)
ClassCastException java.lang.Long cannot be cast to java.math.BigInteger user/eval21314 (form-init5235971328632709373.clj:1)
;; Ah!
user> (clojure.lang.Ratio. (biginteger 1) (biginteger 1))
1/1
The proper way is to coerce the parameters to java.math.BigInteger. Why? Historical
reasons: clojure.lang.Ratio only accepts java.math.BigInteger because, back when
it was written, Clojure didn't have the clojure.lang.BigInt type, and no one has touched the
code since quite literally forever [1]
(https://github.com/clojure/clojure/commits/master/src/jvm/clojure/lang/Ratio.java).
The fun train doesn't stop here. For example, we may want to create a ratio with a
denominator of 0. Let's try the usual way:
;; Good
user> 1/0
ArithmeticException Divide by zero clojure.lang.Numbers.divide (Numbers.java:158)
;; Consistent!
user> (/ 1 0)
ArithmeticException Divide by zero clojure.lang.Numbers.divide (Numbers.java:158)
Bummer. But then again it might make sense, after all a Ratio with a denominator
value 0 may result in some weird math occurring. But we haven't tried all the available
constructors yet, so let's do that:
;; I hate this :/
user> (clojure.lang.Ratio. (biginteger 1) (biginteger 0))
1/0
WAIT WHAT.
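If you do need to build a Ratio by hand, one way to avoid the 1/0 trap is to wrap the constructor. The helper below, checked-ratio, is hypothetical (it is not part of Clojure). It is only a sketch: it rejects a zero denominator and reduces the fraction itself, since the raw constructor does neither.

```clojure
(defn checked-ratio
  "Hypothetical helper: builds a clojure.lang.Ratio from two integers,
  throwing on a zero denominator and reducing to lowest terms."
  [n d]
  (let [n (biginteger n)
        d (biginteger d)]
    (when (zero? (.signum d))
      (throw (ArithmeticException. "Divide by zero")))
    (let [g (.gcd n d)
          ;; Fold the sign of the denominator into the divisor so the
          ;; resulting denominator is always positive.
          g (if (neg? (.signum d)) (.negate g) g)]
      (clojure.lang.Ratio. (.divide n g) (.divide d g)))))

(checked-ratio 2 4)   ; => 1/2
(checked-ratio 1 1)   ; => 1/1, still a Ratio
;; (checked-ratio 1 0) throws ArithmeticException
```

Unlike /, this always returns a Ratio, even for whole numbers, so it only makes sense where a Ratio instance is specifically required.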
Combining java.math.BigInteger with clojure.lang.Ratio is even more fun, especially when it comes to corner cases:
;; Alright makes sense
user> (.denominator (* 7919/7920 (/ 1 Long/MAX_VALUE)))
73049106531889824391440
user> (class (.denominator (* 7919/7920 (/ 1 Long/MAX_VALUE))))
java.math.BigInteger
;; WAIT BUT WHY
user> (/ 7919 (* 7919/7920 (/ 1 Long/MAX_VALUE)))
73049106531889824391440N
user> (class (/ 7919 (* 7919/7920 (/ 1 Long/MAX_VALUE))))
clojure.lang.BigInt
The result type differs while logically you performed the exact same computation. And
don't forget that those types are not always cooperating nicely, so you introduce more
corner cases. Oh boy!
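One way to paper over the BigInteger/BigInt split is to normalize at the boundary. The sketch below assumes you are willing to pay for a conversion through clojure.core/bigint; the name r is just a local alias for the ratio used in the transcript.

```clojure
(def r (* 7919/7920 (/ 1 Long/MAX_VALUE)))

(class (denominator r))            ; => java.math.BigInteger
(class (bigint (denominator r)))   ; => clojure.lang.BigInt

;; = treats the two integer representations as the same number,
;; even though their classes differ.
(= (denominator r) (/ 7919 r))     ; => true
```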
Who wears Cheetah?
Leaking abstractions is not cool. Clojure tries to present leaking abstractions as a feature. This is doubly not cool.
Number type promotion is not cool if there's no clear way to demote a type. It's doubly
not cool in Clojure, because there's no clear documentation on how and when
promotion works. Existing documentation is lacking at best
(http://clojure.org/data_structures#Data%20Structures-Numbers).
Consistency is great. Clojure is not great at consistency, though, and sometimes it feels
like the "principle of least astonishment" is being proactively broken by Clojure's design in the numbers domain.
Here's an incomplete and perhaps redundant list of things that I find annoying,
surprising or outright stupid in Clojure:
- Arithmetic overflows everywhere! Multiplying java.lang.Integer values will never
  cause overflow, however java.lang.Long will fail to be autopromoted. To be
  fair, this behavior is right there in the docstring for *, but then again, who reads
  the docstring for multiplication? There are also *', +' and -', all of which
  auto-promote the result, but what are the chances you ever even knew about them?
- clojure.lang.Ratio uses java.math.BigInteger and not clojure.lang.BigInt
  for numerator and denominator. Why? Because when Ratio was created (back in
  2010) clojure.lang.BigInt simply didn't exist and when it was finally created,
  Ratio was not updated to reflect the change. Bonus points for figuring out
  why clojure.lang.BigInt was created in the first place.
- Floats and doubles are... Well, the same floats and doubles as in Java. There's no
  attempt to hide them away. So, things like infinity and NaN are there, but they're
  not really supported by Clojure. How does one check if a number is NaN or
  Infinity in Clojure? You use the java.lang.Float or java.lang.Double classes for
  that, specifically static methods such as isNaN, isFinite, etc. Hardly a portable
  solution.
- Documentation is bad. Like, terribad. We're talking about a language with 8 years
  of development history, with strong backing from commercial companies, with
  successful commercial and open-source products written in the language, and yet
  we see very little focus on documenting things, even essential things, numbers
  being one of them.
- Unsigned math is not supported. There's nothing in Java, thus there's nothing in
  Clojure. Make of it what you will.
- Bit operations do not belong in the core namespace. It's clutter; most programs don't
  need them. More than that, they're simply broken. More on that in a few bits.
- *unchecked-math* is one big can of worms and can quite literally screw up your
  library's performance or even behavior when someone using your library sets said
  dynamic var.
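Two of the items above are easy to check at the REPL: the three overflow behaviors, and the interop-only NaN/infinity checks. This is a minimal sketch; overflows? is a hypothetical helper defined here for illustration, and everything else is clojure.core or java.lang.Double.

```clojure
(defn overflows?
  "Hypothetical helper: true when calling f on args throws ArithmeticException."
  [f & args]
  (try
    (apply f args)
    false
    (catch ArithmeticException _ true)))

;; Default: checked arithmetic throws on long overflow.
(overflows? * 2 Long/MAX_VALUE)         ; => true

;; The quoted variants promote to BigInt instead.
(overflows? *' 2 Long/MAX_VALUE)        ; => false
(*' 2 Long/MAX_VALUE)                   ; => 18446744073709551614N

;; The unchecked family silently wraps around.
(unchecked-multiply 2 Long/MAX_VALUE)   ; => -2

;; NaN and infinity checks go through Java interop.
(Double/isNaN Double/NaN)                      ; => true
(Double/isInfinite Double/POSITIVE_INFINITY)   ; => true
```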
So, bit operations. Clojure really lets you down here, and you as a programmer
have to be extremely careful to avoid the common pitfalls. Most recovering C and C++
addicts would say that a bit shift to the left by one bit is equal to multiplying by 2.
Clojure says NO. Unless you multiply by a different kind of two:
user> (bit-shift-left Long/MAX_VALUE 1)
-2
user> (* 2 Long/MAX_VALUE)
ArithmeticException integer overflow clojure.lang.Numbers.throwIntOverflow (Numbers.java:1501)
user> (* 2N Long/MAX_VALUE)
18446744073709551614N
The first behavior is a result of lacking a proper unsigned, modular number type. The
exception in the second is the result of "protecting" the users from overflowing,
instead of promoting the type (as expected). And then the third one does the right
thing. Or maybe a wrong thing, but in any case you would expect all three expressions to do
the same thing. What's worse is that there are plenty of similar examples. Predictability is
important, people!
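For comparison, a left shift that agrees with promoting multiplication can be sketched on top of java.math.BigInteger. The function shift-left' is hypothetical (not part of clojure.core), and it trades speed for predictability by always returning a BigInt.

```clojure
(defn shift-left'
  "Hypothetical helper: left shift that promotes instead of wrapping."
  [x n]
  (bigint (.shiftLeft (biginteger x) (int n))))

(shift-left' Long/MAX_VALUE 1)     ; => 18446744073709551614N
(shift-left' 1 3)                  ; => 8N

;; Now shifting by one and promoting multiplication by two agree.
(= (shift-left' Long/MAX_VALUE 1)
   (*' 2 Long/MAX_VALUE))          ; => true
```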
I wanna look tan
Even though no one asked me, I'll try to imagine a better world of Clojure math. First off, the number types. There should be only two ways to represent numbers in Clojure:
integers and reals. Integers should be signed and unbounded. Integer division always
produces reals WITHOUT EXCEPTIONS. Integers can be promoted to reals, but reals can
never be demoted to integers. Reals can follow the same approach as
java.math.BigDecimal
(http://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html), Python's
decimal (https://docs.python.org/2/library/decimal.html) module or MPFR
(http://www.mpfr.org/mpfr-current/mpfr.html). Now, obviously you'll immediately find a problem with this approach, namely that you need a proper context for all
decimal operations. I say, default to large, and I mean LARGE precision. As in, precision
that doesn't even make sense anymore, like 2^20. Let people control the precision
through a context. Leave only basic math operations in the core namespace: addition,
subtraction, multiplication and division. Define those operations clearly, make sure
that division always produces reals and truncate where need be.
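The "context" problem already shows up with today's BigDecimal support. The sketch below uses only clojure.core's with-precision macro and illustrates why a default precision is needed at all; it is not an implementation of the proposed design.

```clojure
;; Without a context, a non-terminating quotient cannot be represented.
(try
  (/ 1M 3M)
  (catch ArithmeticException _ :no-exact-result))   ; => :no-exact-result

;; with-precision supplies the context (here: 10 significant digits).
(with-precision 10 (/ 1M 3M))                        ; => 0.3333333333M

;; An optional rounding mode can be given as well.
(with-precision 3 :rounding FLOOR (/ 2M 3M))         ; => 0.666M
```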
Then, introduce a math namespace. Put modular math operations in math.modular,
binary math and bit shifting in math.binary, functions and macros that help with handling the real context, rounding, etc. in math.real, Ratio in, well,
math.ratio, and IEEE 754-2008 floating point numbers in math.float. Add math.platform.jvm and
math.platform.js for exposing platform-specific numbers.
But most importantly, write documentation. Everything has to be documented
extensively and clearly, without exceptions. Great code and great design are only half
the battle; clear documentation is the other.
As far as the negative impact of said change goes, I can only think of performance. But only a small minority of Clojure users type-hint everything or use Zachary Tellman's
primitive-math (https://github.com/ztellman/primitive-math). Everyone else? They
get to enjoy the math setup that has very questionable decisions baked into it without
worrying much about the performance.
Let me take a selfie
I tend to complain. A lot. The math in Clojure is just one of my complaint targets.
However, it's a valid target. The math is neglected in Clojure: I see no attention being
paid to it by core developers, there's no organized effort to make it better, and there have
been zero calls to the community to ask for improvement ideas. And what's worse is that
this math is completely ingrained into the Clojure core namespace and you can't replace it
easily. There's no way to fix the numbers in Clojure from the outside.
I could take this on, spend plenty of time writing the proposal, pushing it to Clojure
core, writing the code afterwards and pushing for a solution, but what are the chances that it's
ever going to get accepted? The upfront cost of this work is tremendous and there's
very little chance that such work would ever end up in Clojure core.
1. Where "literally forever" is used in terms of Internet age.
6 Comments
Alex Miller •
I think there are some valid points in here and tickets or patches would be welcome for them. We
have fixed a number of math-related issues over the last few releases.
It's important to understand that there is a design at work here, and that design is layered over
Java's (different) design. Inevitably, you will see some points of disconnect. I don't feel that you
captured the intent of Clojure's design in this post though. Numerics are inherently tricky in the
tension between fixed precision (for speed) and arbitrary precision (for flexibility and accuracy). Java has an additional level of complexity due to having both primitive and "boxed" Object versions of the
primitives.
Clojure represents numbers as one of: long (64-bit two's complement signed integer), double
(double-precision 64-bit IEEE 754 floating point), biginteger (arbitrary-precision integers), bigdecimal
(arbitrary-precision floating point), and ratios. Ratios represent a ratio between (big)integers - the
arbitrary precision type is used to represent ratios that could not otherwise be represented.
Clojure uses "contagion" when computing arithmetic operations between a mixture of types and will
produce a result with the greater precision.
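Contagion is easy to observe at the REPL. A minimal sketch, assuming a stock Clojure 1.3 or later; note that double wins over every other type once it is involved:

```clojure
(class (+ 1 1))      ; => java.lang.Long
(class (+ 1 1N))     ; => clojure.lang.BigInt
(class (+ 1 1/2))    ; => clojure.lang.Ratio
(class (+ 1N 1M))    ; => java.math.BigDecimal
(class (+ 1/2 0.5))  ; => java.lang.Double
(class (+ 1M 1.0))   ; => java.lang.Double
```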
When performing arithmetic on fixed precision numbers and thinking about overflow/underflow, you
have three options: 1) ignore and overflow/underflow (you can get this behavior in Clojure with the
"unchecked" family of operations - these are fast and at times useful for things like hash
computations), 2) check and throw exceptions (the default - fast but safe), and 3) check and
autopromote (you can get this behavior with the '-suffix ops - +', -', etc). All of these have utility
depending on what you are doing. Clojure chooses to be fast and safe by default.
The article mentions java.lang.Integer and java.lang.Float but these are never used in Clojure code.
Conversion support is provided (to/from Long and Double respectively) for interop with Java apis as
needed, but these are not valid types as far as Clojure is concerned.
Clojure's BigInt class is different from java.math.BigInteger in that it retains long semantics while in
long range but takes on BigInteger semantics when past long range - this is a performance
optimization as well as a hook for Clojure-specific hash values and equality comparisons.
Balancing all of the tradeoffs between the already complicated Java landscape and Clojure's view of
numerics along with a variety of usages with different performance needs is challenging. A variety of
options were extensively considered before reaching the current design, which provides reasonable
performance for most common operations and the possibility to be more specific when you have
different needs.
Some other useful links in the docs:
http://clojure.org/data_struct...
http://clojure.org/java_intero...
Many of the things you present as problems seem like a misunderstanding of the design and
tradeoffs to me. Others are possibly areas for improvements and we'd be happy to discuss them
(from a problem-solving perspective) on the mailing lists.
mishok13 • Mod
Alex, first of all I appreciate you taking the time to go through this rant. It is, as I mentioned in
the beginning, an angry post mostly because I got bitten in the ass by most of these corner
cases (you prefer the name "design decisions" :) ) presented here.
It is my strong opinion that these decisions are questionable at best and harmful at worst.
A lot of examples provided in the blog post are watered down extracts from actual code and
are worked around by having type-checking conditions around them. As far as the
performance argument of having this zoo of types (including primitive ones) I think it's
flawed. Clojure is not a performance beast and anyone needing it to be fast is better off with
Zachary Tellman's libraries like primitive-math or immutable-bit-set. To extend my thought
further: I think it's the responsibility of the compiler to make things performant, it's the
responsibility of language designer to make non-leaking abstractions and yet provide tools to
break through abstraction layers.
As for the documentation links you provide all I can do is quote the post itself "Existing
documentation is lacking at best.". I think that a page of text briefly covering numbers is
simply not enough. And trying to find motivation behind a lot of the decisions taken in Clojure
is a daunting task, since most code is just that, code without any explanation as to *why*
things work the way they do.
mishok13 • Mod
One nitpick from your response that I think is pretty funny: you shouldn't claim that
java.lang.Integer and java.lang.Float are not used in Clojure and then link to the Java interop
page that explicitly uses "int". :)
Mike •
It's a bit unfair to say that there has been zero interest in improvement ideas. Lots of people have
worked on this, and I think Alex has done a good job of explaining the design and some of the trade-offs involved.
In particular I believe it is impossible for a single approach to achieve both of the following:
a) High performance (as you get with Java primitive maths operations)
b) General purpose numerical tower supporting arbitrary precision and multiple representations
Clojure actually provides both, but you need to use different functions (unchecked primitive
operations vs. the auto-promoting function variants)
One other thing you should consider is that for serious numerical work, you won't want to use the
standard functions anyway: it is much more likely that you will want to use core.matrix backed with
an implementation that supports the specific types you need (e.g. vectorz-clj for N-dimensional
double-precision primitive arrays). That will be much faster and more idiomatic than attempting to roll your own numerical code using the standard Clojure numerical operations.
mishok13 • Mod
Mike, thanks for taking the time to write a response. I don't think you understand the design I
have laid out in the post. I'm not arguing for a single approach to everything, I'm arguing that
a "general purpose numerical tower" should be simplified, stripped of or rather abstracted
from needless Java artifacts and be documented as extensively as possible. Beyond that
Clojure can provide access to platform-specific, performant types and routines operating
with said types. As you said, Clojure provides both but they're mixed between each other
leading to a lot of confusion.
And your point about serious numerical work only reinforces my opinion that Clojure should
hide platform specific types, bit shifting functions and lots of other ops from core language. If
you are not going to use them for serious numerical work, why have them clutter the core
namespace?
Mike •
I understand your proposal, I just think there are quite a few problematic aspects:
a) It will break backwards compatibility, so virtually zero chance of adoption at this
point given Clojure/core propensities ;-)
b) It will mean people being forced to use different operations like modular/+ and
real/+ based on the types of arguments. That will make code become ugly very
fast.... and how do you handle the very common case of mixed arguments?
c) Platform-specific namespaces seem like a code smell to me. For core numerical
functionality, I'd expect the names to be cross-platform compatible (even if the
underlying implementation may differ)
d) Ultimately, numerical operations are pretty core to any language. I think most
people would expect them to be in clojure.core, and "just work". The exact form of
these standard operations could certainly be debated, but I think the current ones are
at least a reasonable compromise between performance and generality.
I guess someone could implement these new namespaces as a library. It would
probably be pretty useful for people feeling the same kind of pain points. FWIW, I've
been annoyed by clojure.core numerical behaviour many times, but there is usually a
workaround so it has never become painful enough for me to want a separate library.
The call for better documentation I wholeheartedly support, of course :-)