7/21/2019 Clojure, Numbers, Despair http://slidepdf.com/reader/full/clojure-numbers-despair 1/8

Clojure, Numbers, Despair

31/10/2015 Clojure, numbers, despair

http://blog.mishkovskyi.net/posts/2015/Oct/29/clojure-numbers-despair 2/8

clojure.lang.Ratio

java.lang.Number

java.lang.Integer

java.lang.Long

java.math.BigInteger

java.math.BigDecimal

java.lang.Float

java.lang.Double

Not surprisingly, most of these are just Java types. However, two more types are added: BigInt (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/BigInt.java) and Ratio (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Ratio.java).

Both are weird. I'd like to focus a bit on Ratio. A Ratio can be created by integer division, but only when the division cannot produce an integer:

;; Aight

user> (type (/ 1 2))

clojure.lang.Ratio

;; Not really expecting this

user> (type (/ 1 1))

java.lang.Long

;; Yeah, well, WAIT WHAT

user> (type (/ 1N 1M))

java.math.BigDecimal
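Since the concrete class of a quotient depends on its operands and its value, code that consumes division results is safer branching on predicates than on classes. A minimal sketch, assuming only clojure.core; the name describe-quotient is hypothetical and introduced here for illustration:

```clojure
;; Hypothetical helper: classify the result of a division by category
;; instead of by concrete class.
(defn describe-quotient [a b]
  (let [q (/ a b)]
    (cond
      (ratio? q)   :ratio     ; clojure.lang.Ratio
      (integer? q) :integer   ; Long, BigInt, BigInteger, ...
      (decimal? q) :decimal   ; java.math.BigDecimal
      (float? q)   :float     ; Double or Float
      :else        :unknown)))

(describe-quotient 1 2)   ; :ratio
(describe-quotient 1 1)   ; :integer
(describe-quotient 1.0 2) ; :float
```

Going by the REPL session above, (describe-quotient 1N 1M) should land in the :decimal branch.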

We can also just call the Ratio constructor (and fail miserably in some cases):

;; Cool

user> (type 1/2)

clojure.lang.Ratio

;; Eh?

user> (clojure.lang.Ratio. 1 1)

ClassCastException java.lang.Long cannot be cast to java.math.BigInteger user/eval21314 (form-init5235971328632709373.clj:1)

;; Ah!

user> (clojure.lang.Ratio. (biginteger 1) (biginteger 1))

1/1

The proper way is to coerce the parameters to java.math.BigInteger. Why? Historical reasons: clojure.lang.Ratio only accepts java.math.BigInteger because back when it was written Clojure didn't have a clojure.lang.BigInt type, and no one has touched the code since quite literally forever [1] (https://github.com/clojure/clojure/commits/master/src/jvm/clojure/lang/Ratio.java).

The fun train doesn't stop here. For example, we may want to create a ratio with a denominator of 0. Let's try the usual way:

;; Good

user> 1/0

ArithmeticException Divide by zero clojure.lang.Numbers.divide (Numbers.java:158)

;; Consistent!

user> (/ 1 0)

ArithmeticException Divide by zero clojure.lang.Numbers.divide (Numbers.java:158)

Bummer. But then again it might make sense: after all, a Ratio with a denominator of 0 may result in some weird math. But we haven't tried all the available constructors yet, so let's do that:

;; I hate this :/

user> (clojure.lang.Ratio. (biginteger 1) (biginteger 0))

1/0

WAIT WHAT.
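One way to sidestep both constructor surprises shown so far (the ClassCastException on longs and the unchecked 1/0) is a small wrapper. This is only a sketch: checked-ratio is a hypothetical helper, not something clojure.core provides.

```clojure
;; Hypothetical helper, not part of clojure.core.
(defn checked-ratio
  "Builds a clojure.lang.Ratio from two integers, coercing both to
  java.math.BigInteger and refusing a zero denominator."
  [n d]
  (when (zero? d)
    (throw (ArithmeticException. "Divide by zero")))
  (clojure.lang.Ratio. (biginteger n) (biginteger d)))

(checked-ratio 1 2)   ; 1/2
```

Like the raw constructor, this does not reduce the fraction, so (checked-ratio 1 1) still prints as 1/1. Plain division with / remains the way to get a normalized result.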

Combining java.math.BigInteger with clojure.lang.Ratio is even more fun, especially when it comes to corner cases:

;; Alright makes sense

user> (.denominator (* 7919/7920 (/ 1 Long/MAX_VALUE)))

73049106531889824391440

user> (class (.denominator (* 7919/7920 (/ 1 Long/MAX_VALUE))))

java.math.BigInteger

;; WAIT BUT WHY

user> (/ 7919 (* 7919/7920 (/ 1 Long/MAX_VALUE)))

73049106531889824391440N

user> (class (/ 7919 (* 7919/7920 (/ 1 Long/MAX_VALUE))))

clojure.lang.BigInt

The result type differs while logically you performed the exact same computation. And

don't forget that those types are not always cooperating nicely, so you introduce more

corner cases. Oh boy!
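If downstream code cares about the class, one workaround is to normalize explicitly with bigint. A minimal sketch using only clojure.core functions; the var name r is just for illustration:

```clojure
;; The same ratio as in the session above.
(def r (* 7919/7920 (/ 1 Long/MAX_VALUE)))

(class (denominator r))            ; java.math.BigInteger
(class (bigint (denominator r)))   ; clojure.lang.BigInt

;; After coercion both computations agree in value and in class.
(= (bigint (denominator r)) (/ 7919 r))   ; true
```

This does not remove the inconsistency, it only hides it at the boundary where the value leaves your code.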

Who wears Cheetah?

Leaking abstractions is not cool. Clojure tries to present leaking abstractions as a feature. This is doubly not cool.

Number type promotion is not cool if there's no clear way to demote a type. It's doubly not cool in Clojure, because there's no clear documentation on how and when promotion works. Existing documentation is lacking at best (http://clojure.org/data_structures#Data%20Structures-Numbers).

Consistency is great. Clojure is not great at consistency though, and sometimes it feels like "the principle of least astonishment" is being proactively broken by Clojure's design in the numbers domain.

Here's an incomplete and perhaps redundant list of things that I find annoying,

surprising or outright stupid in Clojure:

- Arithmetic overflows everywhere! Multiplying java.lang.Integer values will never overflow, but java.lang.Long values are not auto-promoted: the multiplication throws instead. To be fair, this behavior is right there in the docstring for *, but then again, who reads the docstring for multiplication? There are also *', +' and -', all of which auto-promote the result, but what are the chances you ever even knew about them?

- clojure.lang.Ratio uses java.math.BigInteger and not clojure.lang.BigInt for its numerator and denominator. Why? Because when Ratio was created (back in 2010) clojure.lang.BigInt simply didn't exist, and when it was finally created, Ratio was not updated to reflect the change. Bonus points for figuring out why clojure.lang.BigInt was created in the first place.

- Floats and doubles are... well, the same floats and doubles as in Java. There's no attempt to hide them away. So things like infinity and NaN are there, but they're not really supported by Clojure. How does one check whether a number is NaN or infinity in Clojure? You use the java.lang.Float or java.lang.Double classes for that, specifically static methods such as isNaN, isFinite, etc. Hardly a portable solution.

- Documentation is bad. Like, terribad. We're talking about a language with 8 years of development history, with strong backing from commercial companies, with successful commercial and open-source products written in the language, and yet we see very little focus on documenting things, even essential things, numbers being one of them.

- Unsigned math is not supported. There's nothing in Java, thus there's nothing in Clojure. Make of it what you will.

- Bit operations do not belong in the core namespace. It's clutter; most programs don't need them. More than that, they're simply broken. More on that in a few bits.

- *unchecked-math* is one big can of worms and can quite literally screw up your library's performance or even behavior when someone using your library sets said dynamic var.
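The NaN and infinity checks mentioned in the list above can be sketched through Java interop as follows. The helper names nan? and infinite-number? are hypothetical, and the code is JVM-only, which is exactly the portability complaint. Newer Clojure releases (1.11 and later) also ship NaN? and infinite? in clojure.core.

```clojure
;; Hypothetical helpers built on java.lang.Double static methods (JVM only).
(defn nan? [x]
  (and (float? x) (Double/isNaN (double x))))

(defn infinite-number? [x]
  (and (float? x) (Double/isInfinite (double x))))

(nan? Double/NaN)                            ; true
(nan? 1.0)                                   ; false
(nan? 1)                                     ; false, not a floating point value
(infinite-number? Double/POSITIVE_INFINITY)  ; true
(infinite-number? 1.0)                       ; false
```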

So, bit operations. Clojure really lets you down here, and you as a programmer have to be extremely careful to avoid the common pitfalls. Most recovering C and C++ addicts would say that a bit shift to the left by one bit is equal to multiplying by 2. Clojure says NO. Unless you multiply by a different kind of two:

user> (bit-shift-left Long/MAX_VALUE 1)

-2

user> (* 2 Long/MAX_VALUE)

ArithmeticException integer overflow clojure.lang.Numbers.throwIntOverflow (Numbers.java:1501)

user> (* 2N Long/MAX_VALUE)

18446744073709551614N

The first behavior is a result of lacking a proper unsigned, modular number type. The

exception in the second is the result of "protecting" the users from overflowing,

instead of promoting the type (as expected). And then the third one does the right

thing. Or maybe the wrong thing, but in any case you would expect all 3 operations to do the same thing. What's worse is that there are plenty of similar examples. Predictability is important, people!
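For comparison, here are the three overflow behaviors side by side, using the checked, auto-promoting and unchecked operations that clojure.core provides. The last function, shift-left-promoting, is a hypothetical helper sketched here to show a left shift that agrees with multiplication by two; it is not a core function.

```clojure
;; 1. Checked (the default): throws on long overflow.
(try
  (* 2 Long/MAX_VALUE)
  (catch ArithmeticException e
    (.getMessage e)))                     ; "integer overflow"

;; 2. Auto-promoting variant: the result becomes a clojure.lang.BigInt.
(*' 2 Long/MAX_VALUE)                     ; 18446744073709551614N

;; 3. Unchecked: wraps around like a Java long, same as the bit shift.
(unchecked-multiply 2 Long/MAX_VALUE)     ; -2

;; Hypothetical helper: shift via java.math.BigInteger so no bits are lost.
(defn shift-left-promoting [x n]
  (bigint (.shiftLeft (biginteger x) (int n))))

(shift-left-promoting Long/MAX_VALUE 1)   ; 18446744073709551614N
```

The price of the promoting shift is that every call allocates a BigInteger, so it is a correctness aid rather than a drop-in replacement for bit-shift-left.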

I wanna look tan

Even though no one asked me, I'll try to imagine a better world of Clojure math. First off, the number types. There should be only two ways to represent numbers in Clojure: integers and reals. Integers should be signed and unbounded. Integer division always produces reals WITHOUT EXCEPTIONS. Integers can be promoted to reals, but reals can never be demoted to integers. Reals can follow the same approach as java.math.BigDecimal (http://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html), Python's decimal (https://docs.python.org/2/library/decimal.html) module or MPFR (http://www.mpfr.org/mpfr-current/mpfr.html). Now, obviously you'll immediately find a problem with this approach, namely that you need a proper context for all decimal operations. I say, default to large, and I mean LARGE precision. As in, precision that doesn't even make sense anymore, like 2^20. Let people control the precision through a context. Leave only basic math operations in the core namespace: addition, subtraction, multiplication and division. Define those operations clearly, make sure that division always produces reals and truncate where need be.
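The "proper context" problem is already visible with today's BigDecimal support. As a rough illustration of what context-controlled precision looks like, here is the existing with-precision macro from clojure.core; the proposal above would amount to making a very large precision the default instead of having none.

```clojure
;; Without a math context, a non-terminating quotient throws.
(try
  (/ 1M 3M)
  (catch ArithmeticException e
    :no-exact-representation))                  ; :no-exact-representation

;; with-precision sets the precision (and optionally the rounding mode)
;; used by BigDecimal operations in its body.
(with-precision 10 (/ 1M 3M))                   ; 0.3333333333M
(with-precision 5 :rounding FLOOR (/ 2M 3M))    ; 0.66666M
```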

Then, introduce a math namespace. Put modular math operations in math.modular. Use math.binary for binary math and bit shifting; math.real for functions and macros that help with handling the real context, rounding, etc.; math.ratio for, well, Ratio; math.float for IEEE 754-2008 floating point numbers; and math.platform.jvm and math.platform.js for exposing platform-specific numbers.

But most importantly, write documentation. Everything has to be documented

extensively and clearly, without exceptions. Great code and great design is only half

the battle, clear documentation is the other.

As far as the negative impact of said change goes, I can only think of performance. But only a small minority of Clojure users type-hint everything or use Zachary Tellman's primitive-math (https://github.com/ztellman/primitive-math). Everyone else? They get to enjoy a math setup that has very questionable decisions baked into it, without worrying much about performance.

Let me take a selfie

I tend to complain. A lot. The math in Clojure is just one of my complaint targets. However, it's a valid target. The math is neglected in Clojure: I see no attention being paid to it by core developers, there's no organized effort to make it better, and there have been zero calls to the community to ask for improvement ideas. And what's worse is that this math is completely ingrained into the Clojure core namespace and you can't replace it easily. There's no way to fix the numbers in Clojure from the outside.

I could take this on: spend plenty of time writing the proposal, pushing it to Clojure core, writing the code afterwards and pushing for a solution. But what are the chances that it's ever going to get accepted? The upfront cost of this work is tremendous and there's very little chance that such work would ever end up in Clojure core.

1. Where "literally forever" is used in terms of Internet age.

6 Comments

Alex Miller •

I think there are some valid points in here and tickets or patches would be welcome for them. We

have fixed a number of math-related issues over the last few releases.

It's important to understand that there is a design at work here, and that design is layered over

Java's (different) design. Inevitably, you will see some points of disconnect. I don't feel that you

captured the intent of Clojure's design in this post though. Numerics are inherently tricky in the

tension between fixed precision (for speed) and arbitrary precision (for flexibility and accuracy). Java has an additional level of complexity due to having both primitive and "boxed" Object versions of the

primitives.

Clojure represents numbers as one of: long (64-bit two's complement signed integer), double

(double-precision 64-bit IEEE 754 floating point), biginteger (arbitrary-precision integers), bigdecimal

(arbitrary-precision floating point), and ratios. Ratios represent a ratio between (big)integers - the

arbitrary precision type is used to represent ratios that could not otherwise be represented.

Clojure uses "contagion" when computing arithmetic operations between a mixture of types and will

produce a result with the greater precision.

When performing arithmetic on fixed precision numbers and thinking about overflow/underflow, you

have three options: 1) ignore and overflow/underflow (you can get this behavior in Clojure with the

"unchecked" family of operations - these are fast and at times useful for things like hash

computations), 2) check and throw exceptions (the default - fast but safe), and 3) check and

autopromote (you can get this behavior with the '-suffix ops - +', -', etc). All of these have utility

depending on what you are doing. Clojure chooses to be fast and safe by default.

The article mentions java.lang.Integer and java.lang.Float but these are never used in Clojure code.

Conversion support is provided (to/from Long and Double respectively) for interop with Java apis as

needed, but these are not valid types as far as Clojure is concerned.

Clojure's BigInt class is different from java.math.BigInteger in that it retains long semantics while in

long range but takes on BigInteger semantics when past long range - this is a performance

optimization as well as a hook for Clojure-specific hash values and equality comparisons.

Balancing all of the tradeoffs between the already complicated Java landscape and Clojure's view of

numerics along with a variety of usages with different performance needs is challenging. A variety of

options were extensively considered before reaching the current design, which provides reasonable

performance for most common operations and the possibility to be more specific when you have

different needs.

Some other useful links in the docs:

http://clojure.org/data_struct...

http://clojure.org/java_intero...

Many of the things you present as problems seem like a misunderstanding of the design and

tradeoffs to me. Others are possibly areas for improvements and we'd be happy to discuss them

(from a problem-solving perspective) on the mailing lists.

mishok13 • Mod

Alex, first of all I appreciate you taking the time to go through this rant. It is, as I mentioned in

the beginning, an angry post mostly because I got bitten in the ass by most of these corner

cases (you prefer the name "design decisions" :) ) presented here.

It is my strong opinion that these decisions are questionable at best and harmful at worst.

A lot of examples provided in the blog post are watered down extracts from actual code and

are worked around by having type-checking conditions around them. As far as the

performance argument of having this zoo of types (including primitive ones) I think it's

flawed. Clojure is not a performance beast and anyone needing it to be fast is better off with

Zachary Tellman's libraries like primitive-math or immutable-bit-set. To extend my thought

further: I think it's the responsibility of the compiler to make things performant, it's the

responsibility of language designer to make non-leaking abstractions and yet provide tools to

break through abstraction layers.

As for the documentation links you provide, all I can do is quote the post itself: "Existing documentation is lacking at best." I think that a page of text briefly covering numbers is simply not enough. And trying to find the motivation behind a lot of the decisions taken in Clojure

is a daunting task, since most code is just that, code without any explanation as to *why*

things work the way they do.

mishok13 • Mod

One nitpick from your response that I think is pretty funny: you shouldn't claim that java.lang.Integer and java.lang.Float are not used in Clojure and then link to a Java interop page that explicitly uses "int". :)

Mike •

It's a bit unfair to say that there has been zero interest in improvement ideas. Lots of people have

worked on this, and I think Alex has done a good job of explaining the design and some of the trade-offs involved.

In particular I believe it is impossible for a single approach to achieve both of the following:

a) High performance (as you get with Java primitive maths operations)

b) General purpose numerical tower supporting arbitrary precision and multiple representations

Clojure actually provides both, but you need to use different functions (unchecked primitive

operations vs. the auto-promoting function variants)

One other thing you should consider is that for serious numerical work, you won't want to use the

standard functions anyway: it is much more likely that you will want to use core.matrix backed with

an implementation that supports the specific types you need (e.g. vectorz-clj for N-dimensional

double-precision primitive arrays). That will be much faster and more idiomatic than attempting to roll your own numerical code using the standard Clojure numerical operations.

mishok13 • Mod

Mike, thanks for taking the time to write a response. I don't think you understand the design I

have laid out in the post. I'm not arguing for a single approach to everything, I'm arguing that

a "general purpose numerical tower" should be simplified, stripped of or rather abstracted

from needless Java artifacts and be documented as extensively as possible. Beyond that

Clojure can provide access to platform-specific, performant types and routines operating

with said types. As you said, Clojure provides both, but they're mixed together, leading to a lot of confusion.

And your point about serious numerical work only reinforces my opinion that Clojure should

hide platform specific types, bit shifting functions and lots of other ops from core language. If

you are not going to use them for serious numerical work, why have them clutter the core

namespace?

Mike •

I understand your proposal, I just think there are quite a few problematic aspects:

a) It will break backwards compatibility, so virtually zero chance of adoption at this

point given Clojure/core propensities ;-)

b) It will mean people being forced to use different operations like modular/+ and

real/+ based on the types of arguments. That will make code become ugly very

fast.... and how do you handle the very common case of mixed arguments?

c) Platform-specific namespaces seem like a code smell to me. For core numerical

functionality, I'd expect the names to be cross-platform compatible (even if the

underlying implementation may differ)

d) Ultimately, numerical operations are pretty core to any language. I think most

people would expect them to be in clojure.core, and "just work". The exact form of

these standard operations could certainly be debated, but I think the current ones are

at least a reasonable compromise between performance and generality.

I guess someone could implement these new namespaces as a library. It would

probably be pretty useful for people feeling the same kind of pain points. FWIW, I've

been annoyed by clojure.core numerical behaviour many times, but there is usually a

workaround so it has never become painful enough for me to want a separate library.

The call for better documentation I wholeheartedly support of course :-)