Search Results: "jgg"

31 October 2021

Joachim Breitner: A mostly allocation-free optional type

The Motoko programming language has a built-in data type for optional values, named ?t with values null and ?v (for v : t); this is the equivalent of Haskell s Maybe, Ocaml s option or Rust s Option. In this post, I explain how Motoko represents such optional values (almost) without allocation. I neither claim nor expect that any of this is novel; I just hope it s interesting.

Uniform representation The Motoko programming language, designed by Andreas Rossberg and implemented by a pretty cool team at DFINITY is a high-level language with strict semantics and a strong, structural, equi-recursive type system that compiles down to WebAssembly. Because the type system supports polymorphism, it represents all values uniformly. Simplified for the purpose of this blog post, they are always pointers into a heap object where the first word of the heap object, the heap tag, contains information about the value:
 
  tag    
 
The tag is something like array, int64, blob, variant, record, , and it has two purposes:
  • The garbage collector uses it to understand what kind of object it is looking at, so that it can move it around and follow pointers therein. Variable size objects such as arrays include the object size in a subsequent word of the heap object.
  • Some types have values that may have different shapes on the heap. For example, the ropes used in our text representation can either be a plain blob, or a concatenation node of two blobs. For these types, the tag of the heap object is inspected.

The optional type, naively The optional type (?t) is another example of such a type: Its values can either be null, or ?v for some value v of type t, and the primitive operations on this type are the two introduction forms, an analysis function, and a projection for non-null values:
null : () -> ?t
some : t -> ?t
is_some : ?t -> bool
project : ?t -> t     // must only be used if is_some holds
It is natural to use the heap tag to distinguish the two kind of values:
  • The null value is a simple one-word heap object with just a tag that says that this is null:
     
      null  
     
  • The other values are represented by a two-word object: The tag some, indicating that it is a ?v, and then the payload, which is the pointer that represents the value v:
     
      some   payload  
     
With this, the four operations can be implemented as follows:
def null():
  ptr <- alloc(1)
  ptr[0] = NULL_TAG
  return ptr
def some(v):
  ptr <- alloc(2)
  ptr[0] = SOME_TAG
  ptr[1] = v
  return ptr
def is_some(p):
  return p[0] == SOME_TAG
def project(p):
  return p[1]
The problem with this implementation is that null() and some(v) both allocate memory. This is bad if they are used very often, and very liberally, and this is the case in Motoko: For example the iterators used for the for (x in e) construct have type
type Iter<T> =   next : () -> ?T  
and would unavoidably allocate a few words on each iteration. Can we avoid this?

Static values It is quite simple to avoid this for for null: Just statically create a single null value and use it every time:
static_null = [NULL_TAG]
def null():
  return static_null
This way, at least null() doesn t allocate. But we gain more: Now every null value is represented by the same pointer, and since the pointer points to static memory, it does not change even with a moving garbage collector, so we can speed up is_some:
def is_some(p):
  return p != static_null
This is not a very surprising change so far, and most compilers out there can and will do at least the static allocation of such singleton constructors. For example, in Haskell, there is only a single empty list ([]) and a single Nothing value in your program, as you can see in my videos exploring the Haskell heap. But can we get rid of the allocation in some(v) too?

Unwrapped optional values If we don t want to allocate in some(v), what can we do? How about simply
def some(v):
  return v
That does not allocate! But it is also broken. At type ??Int, the values null, ?null and ??null are distinct values, but here this breaks. Or, more formally, the following laws should hold for our four primitive operations:
  1. is_some(null()) = false
  2. v. is_some(some(v)) = true
  3. p. project(some(p)) = p
But with the new definition of some, we d get is_some(some(null())) = false. Not good! But note that we only have a problem if we are wrapping a value that is null or some(v). So maybe take the shortcut only then, and write the following:
def some(v):
  if v == static_null   v[0] == SOME_TAG:
    ptr <- alloc(2)
    ptr[0] = SOME_TAG
    ptr[1] = v
    return ptr
  else:
    return v
The definition of is_some can stay as it is: It is still the case that all null values are represented by static_null. But the some values are now no longer all of the same shape, so we have to change project():
def project(p):
  if p[0] == SOME_TAG:
    return p[1]
  else:
    return p
Now things fall into place: A value ?v can, in many cases, be represented the same way as v, and no allocation is needed. Only when v is null or ?null or ??null or ???null etc. we need to use the some heap object, and thus have to allocate. In fact, it doesn t cost much to avoid allocation even for ?null:
static_some_null = [SOME_TAG, static_null]
def some(v):
  if v == static_null:
    return static_some_null
  else if v[0] == SOME_TAG:
    ptr <- alloc(2)
    ptr[0] = SOME_TAG
    ptr[1] = v
    return ptr
  else:
    return v
So unless one nests the ? type two levels deep, there is no allocation whatsoever, and the only cost is a bit more branching in some and project. That wasn t hard, but quite rewarding, as one can now use idioms like the iterator shown above with greater ease.

Examples The following tables shows the representation of various values before and after. Here [ ] is a pointed-to dynamically allocated heap object, a statically allocated heap object, N = NULL_TAG and S = SOME_TAG.
type value before after
Null null N N
?Int null N N
?Int ?23 [S,23] 23
??Int null N N
??Int ?null [S, N ] S, N
??Int ??23 [S,[S,23]] 23
???Int null N N
???Int ?null [S, N ] S, N
???Int ??null [S,[S, N ]] [S, S, N ]
???Int ???23 [S,[S,[S,23]]] 23

Concluding remarks
  • If you know what parametric polymorphism is, and wonder how this encoding can work in a language that has that, note that this representation is on the low-level of the untyped run-time value representation: We don t need to know the type of v in some(v), merely its heap representation.
  • The uniform representation in Motoko is a bit more involved: The pointers are tagged (by subtracting 1) and some scalar values are represented directly in place (shifted left by 1 bit). But this is luckily orthogonal to what I showed here.
  • Could Haskell do this for its Maybe type? Not so easily:
    • The Maybe type is not built-in, but rather a standard library-defined algebraic data type. But the compiler could feasible detect that this is option-like?
    • Haskell is lazy, so at runtime, the type Maybe could be Nothing, or Just v, or, and this is crucial, a yet to be evaluated expression, also called a thunk. And one definitely needs to distinguish between a thunk t :: Maybe a that may evaluate to Nothing, and a value Just t :: Maybe a that definitely is Just, but contains a value, which may be a thunk.
    Something like this may work for a strict Maybe type or unlifted sums like (# (# #) a #), but may clash with other ticks in GHC, such as pointer tagging.
  • As I said above, I don t expect this to be novel, and I am happy to add references to prior art here.
  • Given that a heap object with tag SOME_TAG now always encodes a tower ? null for n>0, one could try to optimize that even more by just storing the n:
     
      some    n   
     
    But that seems unadvisable: It is only a win if you have deep towers, which is probably rare. Worse, now the project function would have to return such a heap object with n decremented, so now projection might have to allocate, which goes against the cost model expected by the programmer.
  • If you rather want to see code than blog posts, feel free to check out Motoko PR #2115.
  • Does this kind of stuff excite you? Motoko is open source, so your contributions may be welcome!

19 June 2021

Joachim Breitner: Leaving DFINITY

Last Thursday was my last working day at DFINITY. There are various reasons why I felt that after almost three years the DFINITY Foundation isn t quite the right place for me anymore, and this plan has been in the making for a while. Primarily, there are personal pull factors that strongly suggest that I ll take a break from full time employment, so I decided to see the launch of the Internet Computer through and then leave. DFINITY has hired some amazing people, and it was a great pleasure to work with them. I learned a lot (some Rust, a lot of Nix, and just how merciless Conway s law is), and I dare say I had the opportunity to do some good work, contributing my part to make the Internet Computer a reality. I am especially proud of the Interface Specification and the specification-driven design principles behind it. It even comes with a model reference implementation and acceptance test suite, and although we didn t quite get to do formalization, those familiar with the DeepSpec project will recognize some influence of their concept of deep specifications . Besides that, there is of course my work on the Motoko programming language, where I build the backend,a the Candid interoperability layer, where I helped with the formalism, formulated the a generic soundness criterion for Interface Description Languages in a higher order settings and formally verified that in Coq. Fortunately, all of this work is now Free Software or at least Open Source. With so much work poured into this project, I continue to care about it, and you ll see me post on the the developer forum and hack on Motoko. As the Internet Computer becomes gradually more open, I hope I can be gradually more involved again. But even without me contributing full-time I am sure that DFINITY and the Internet Computer will do well; when I left there were still plenty of smart, capable and enthusiastic people forging ahead. So what s next? So far, I have rushed every professional transition in my life: When starting my PhD, when starting my postdoc, when starting my job at DFINITY, and every time I regretted it. So this time, I will take a proper break and will explore the world a bit (as far as that is possible given the pandemic). I will stresslessly contribute to various open source projects. I also hope to do more public outreach and teaching, writing more blog posts again, recording screencasts and giving talks and lectures. If you want to invite me to your user group/seminar/company/workshop, please let me know! Also, I might be up for small interesting projects in a while. Beyond these, I have no concrete plans and am looking forward to the inspiration I might get from hiking through the Scandinavian wilderness. If you happen to stumble across my tent, please stop for a tea.

4 May 2021

Erich Schubert: Machine Learning Lecture Recordings

I have uploaded most of my Machine Learning lecture to YouTube. The slides are in English, but the audio is in German. Some very basic contents (e.g., a demo of standard k-means clustering) were left out from this advanced class, and instead only a link to recordings from an earlier class were given. In this class, I wanted to focus on the improved (accelerated) algorithms instead. These are not included here (yet). I believe there are some contents covered in this class you will find nowhere else (yet). The first unit is pretty long (I did not split it further yet). The later units are shorter recordings. ML F1: Principles in Machine Learning ML F2/F3: Correlation does not Imply Causation & Multiple Testing Problem ML F4: Overfitting beranpassung ML F5: Fluch der Dimensionalit t Curse of Dimensionality ML F6: Intrinsische Dimensionalit t Intrinsic Dimensionality ML F7: Distanzfunktionen und hnlichkeitsfunktionen ML L1: Einf hrung in die Klassifikation ML L2: Evaluation und Wahl von Klassifikatoren ML L3: Bayes-Klassifikatoren ML L4: N chste-Nachbarn Klassifikation ML L5: N chste Nachbarn und Kerndichtesch tzung ML L6: Lernen von Entscheidungsb umen ML L7: Splitkriterien bei Entscheidungsb umen ML L8: Ensembles und Meta-Learning: Random Forests und Gradient Boosting ML L9: Support Vector Machinen - Motivation ML L10: Affine Hyperebenen und Skalarprodukte Geometrie f r SVMs ML L11: Maximum Margin Hyperplane die breitest m gliche Stra e ML L12: Training Support Vector Machines ML L13: Non-linear SVM and the Kernel Trick ML L14: SVM Extensions and Conclusions ML L15: Motivation of Neural Networks ML L16: Threshold Logic Units ML L17: General Artificial Neural Networks ML L18: Learning Neural Networks with Backpropagation ML L19: Deep Neural Networks ML L20: Convolutional Neural Networks ML L21: Recurrent Neural Networks and LSTM ML L22: Conclusion Classification ML U1: Einleitung Clusteranalyse ML U2: Hierarchisches Clustering ML U3: Accelerating HAC mit Anderberg s Algorithmus ML U4: k-Means Clustering ML U5: Accelerating k-Means Clustering ML U6: Limitations of k-Means Clustering ML U7: Extensions of k-Means Clustering ML U8: Partitioning Around Medoids (k-Medoids) ML U9: Gaussian Mixture Modeling (EM Clustering) ML U10: Gaussian Mixture Modeling Demo ML U11: BIRCH and BETULA Clustering ML U12: Motivation Density-Based Clustering (DBSCAN) ML U13: Density-reachable and density-connected (DBSCAN Clustering) ML U14: DBSCAN Clustering ML U15: Parameterization of DBSCAN ML U16: Extensions and Variations of DBSCAN Clustering ML U17: OPTICS Clustering ML U18: Cluster Extraction from OPTICS Plots ML U19: Understanding the OPTICS Cluster Order ML U20: Spectral Clustering ML U21: Biclustering and Subspace Clustering ML U22: Further Clustering Approaches

17 May 2020

Matthew Palmer: Private Key Redaction: UR DOIN IT RONG

Because posting private keys on the Internet is a bad idea, some people like to redact their private keys, so that it looks kinda-sorta like a private key, but it isn t actually giving away anything secret. Unfortunately, due to the way that private keys are represented, it is easy to redact a key in such a way that it doesn t actually redact anything at all. RSA private keys are particularly bad at this, but the problem can (potentially) apply to other keys as well. I ll show you a bit of Inside Baseball with key formats, and then demonstrate the practical implications. Finally, we ll go through a practical worked example from an actual not-really-redacted key I recently stumbled across in my travels.

The Private Lives of Private Keys Here is what a typical private key looks like, when you come across it:
-----BEGIN RSA PRIVATE KEY-----
MGICAQACEQCxjdTmecltJEz2PLMpS4BXAgMBAAECEDKtuwD17gpagnASq1zQTYEC
CQDVTYVsjjF7IQIJANUYZsIjRsR3AgkAkahDUXL0RSECCB78r2SnsJC9AghaOK3F
sKoELg==
-----END RSA PRIVATE KEY-----
Obviously, there s some hidden meaning in there computers don t encrypt things by shouting BEGIN RSA PRIVATE KEY! , after all. What is between the BEGIN/END lines above is, in fact, a base64-encoded DER format ASN.1 structure representing a PKCS#1 private key. In simple terms, it s a list of numbers very important numbers. The list of numbers is, in order:
  • A version number (0);
  • The public modulus , commonly referred to as n ;
  • The public exponent , or e (which is almost always 65,537, for various unimportant reasons);
  • The private exponent , or d ;
  • The two private primes , or p and q ;
  • Two exponents, which are known as dmp1 and dmq1 ; and
  • A coefficient, known as iqmp .

Why Is This a Problem? The thing is, only three of those numbers are actually required in a private key. The rest, whilst useful to allow the RSA encryption and decryption to be more efficient, aren t necessary. The three absolutely required values are e, p, and q. Of the other numbers, most of them are at least about the same size as each of p and q. So of the total data in an RSA key, less than a quarter of the data is required. Let me show you with the above toy key, by breaking it down piece by piece1:
  • MGI DER for this is a sequence
  • CAQ version (0)
  • CxjdTmecltJEz2PLMpS4BX n
  • AgMBAA e
  • ECEDKtuwD17gpagnASq1zQTY d
  • ECCQDVTYVsjjF7IQ p
  • IJANUYZsIjRsR3 q
  • AgkAkahDUXL0RS dmp1
  • ECCB78r2SnsJC9 dmq1
  • AghaOK3FsKoELg== iqmp
Remember that in order to reconstruct all of these values, all I need are e, p, and q and e is pretty much always 65,537. So I could redact almost all of this key, and still give all the important, private bits of this key. Let me show you:
-----BEGIN RSA PRIVATE KEY-----
..............................................................EC
CQDVTYVsjjF7IQIJANUYZsIjRsR3....................................
........
-----END RSA PRIVATE KEY-----
Now, I doubt that anyone is going to redact a key precisely like this but then again, this isn t a typical RSA key. They usually look a lot more like this:
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEAu6Inch7+mWtKn+leB9uCG3MaJIxRyvC/5KTz2fR+h+GOhqj4
SZJobiVB4FrE5FgC7AnlH6qeRi9MI0s6dt5UWZ5oNIeWSaOOeNO+EJDUkSVf67wj
SNGXlSjGAkPZ0nRJiDjhuPvQmdW53hOaBLk5udxPEQbenpXAzbLJ7wH5ouLQ3nQw
HwpwDNQhF6zRO8WoscpDVThOAM+s4PS7EiK8ZR4hu2toon8Ynadlm95V45wR0VlW
zywgbkZCKa1IMrDCscB6CglQ10M3Xzya3iTzDtQxYMVqhDrA7uBYRxA0y1sER+Rb
yhEh03xz3AWemJVLCQuU06r+FABXJuY/QuAVvQIDAQABAoIBAFqwWVhzWqNUlFEO
PoCVvCEAVRZtK+tmyZj9kU87ORz8DCNR8A+/T/JM17ZUqO2lDGSBs9jGYpGRsr8s
USm69BIM2ljpX95fyzDjRu5C0jsFUYNi/7rmctmJR4s4uENcKV5J/++k5oI0Jw4L
c1ntHNWUgjK8m0UTJIlHbQq0bbAoFEcfdZxd3W+SzRG3jND3gifqKxBG04YDwloy
tu+bPV2jEih6p8tykew5OJwtJ3XsSZnqJMwcvDciVbwYNiJ6pUvGq6Z9kumOavm9
XU26m4cWipuK0URWbHWQA7SjbktqEpxsFrn5bYhJ9qXgLUh/I1+WhB2GEf3hQF5A
pDTN4oECgYEA7Kp6lE7ugFBDC09sKAhoQWrVSiFpZG4Z1gsL9z5YmZU/vZf0Su0n
9J2/k5B1GghvSwkTqpDZLXgNz8eIX0WCsS1xpzOuORSNvS1DWuzyATIG2cExuRiB
jYWIJUeCpa5p2PdlZmBrnD/hJ4oNk4oAVpf+HisfDSN7HBpN+TJfcAUCgYEAyvY7
Y4hQfHIdcfF3A9eeCGazIYbwVyfoGu70S/BZb2NoNEPymqsz7NOfwZQkL4O7R3Wl
Rm0vrWT8T5ykEUgT+2ruZVXYSQCKUOl18acbAy0eZ81wGBljZc9VWBrP1rHviVWd
OVDRZNjz6nd6ZMrJvxRa24TvxZbJMmO1cgSW1FkCgYAoWBd1WM9HiGclcnCZknVT
UYbykCeLO0mkN1Xe2/32kH7BLzox26PIC2wxF5seyPlP7Ugw92hOW/zewsD4nLze
v0R0oFa+3EYdTa4BvgqzMXgBfvGfABJ1saG32SzoWYcpuWLLxPwTMsCLIPmXgRr1
qAtl0SwF7Vp7O/C23mNukQKBgB89DOEB7xloWv3Zo27U9f7nB7UmVsGjY8cZdkJl
6O4LB9PbjXCe3ywZWmJqEbO6e83A3sJbNdZjT65VNq9uP50X1T+FmfeKfL99X2jl
RnQTsrVZWmJrLfBSnBkmb0zlMDAcHEnhFYmHFuvEnfL7f1fIoz9cU6c+0RLPY/L7
n9dpAoGAXih17mcmtnV+Ce+lBWzGWw9P4kVDSIxzGxd8gprrGKLa3Q9VuOrLdt58
++UzNUaBN6VYAe4jgxGfZfh+IaSlMouwOjDgE/qzgY8QsjBubzmABR/KWCYiRqkj
qpWCgo1FC1Gn94gh/+dW2Q8+NjYtXWNqQcjRP4AKTBnPktEvdMA=
-----END RSA PRIVATE KEY-----
People typically redact keys by deleting whole lines, and usually replacing them with [...] and the like. But only about 345 of those 1588 characters (excluding the header and footer) are required to construct the entire key. You can redact about 4/5ths of that giant blob of stuff, and your private parts (or at least, those of your key) are still left uncomfortably exposed.

But Wait! There s More! Remember how I said that everything in the key other than e, p, and q could be derived from those three numbers? Let s talk about one of those numbers: n. This is known as the public modulus (because, along with e, it is also present in the public key). It is very easy to calculate: n = p * q. It is also very early in the key (the second number, in fact). Since n = p * q, it follows that q = n / p. Thus, as long as the key is intact up to p, you can derive q by simple division.

Real World Redaction At this point, I d like to introduce an acquaintance of mine: Mr. Johan Finn. He is the proud owner of the GitHub repo johanfinn/scripts. For a while, his repo contained a script that contained a poorly-redacted private key. He since deleted it, by making a new commit, but of course because git never really deletes anything, it s still available. Of course, Mr. Finn may delete the repo, or force-push a new history without that commit, so here is the redacted private key, with a bit of the surrounding shell script, for our illustrative pleasure:
#Add private key to .ssh folder
cd /home/johan/.ssh/
echo  "-----BEGIN RSA PRIVATE KEY-----
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK
 
MIIJKgIBAAKCAgEAxEVih1JGb8gu/Fm4AZh+ZwJw/pjzzliWrg4mICFt1g7SmIE2
TCQMKABdwd11wOFKCPc/UzRH/fHuQcvWrpbOSdqev/zKff9iedKw/YygkMeIRaXB
fYELqvUAOJ8PPfDm70st9GJRhjGgo5+L3cJB2gfgeiDNHzaFvapRSU0oMGQX+kI9
ezsjDAn+0Pp+r3h/u1QpLSH4moRFGF4omNydI+3iTGB98/EzuNhRBHRNq4oBV5SG
Pq/A1bem2ninnoEaQ+OPESxYzDz3Jy9jV0W/6LvtJ844m+XX69H5fqq5dy55z6DW
sGKn78ULPVZPsYH5Y7C+CM6GAn4nYCpau0t52sqsY5epXdeYx4Dc+Wm0CjXrUDEe
Egl4loPKDxJkQqQ/MQiz6Le/UK9vEmnWn1TRXK3ekzNV4NgDfJANBQobOpwt8WVB
rbsC0ON7n680RQnl7PltK9P1AQW5vHsahkoixk/BhcwhkrkZGyDIl9g8Q/Euyoq3
eivKPLz7/rhDE7C1BzFy7v8AjC3w7i9QeHcWOZFAXo5hiDasIAkljDOsdfD4tP5/
wSO6E6pjL3kJ+RH2FCHd7ciQb+IcuXbku64ln8gab4p8jLa/mcMI+V3eWYnZ82Yu
axsa85hAe4wb60cp/rCJo7ihhDTTvGooqtTisOv2nSvCYpcW9qbL6cGjAXECAwEA
AQKCAgEAjz6wnWDP5Y9ts2FrqUZ5ooamnzpUXlpLhrbu3m5ncl4ZF5LfH+QDN0Kl
KvONmHsUhJynC/vROybSJBU4Fu4bms1DJY3C39h/L7g00qhLG7901pgWMpn3QQtU
4P49qpBii20MGhuTsmQQALtV4kB/vTgYfinoawpo67cdYmk8lqzGzzB/HKxZdNTq
s+zOfxRr7PWMo9LyVRuKLjGyYXZJ/coFaobWBi8Y96Rw5NZZRYQQXLIalC/Dhndm
AHckpstEtx2i8f6yxEUOgPvV/gD7Akn92RpqOGW0g/kYpXjGqZQy9PVHGy61sInY
HSkcOspIkJiS6WyJY9JcvJPM6ns4b84GE9qoUlWVF3RWJk1dqYCw5hz4U8LFyxsF
R6WhYiImvjxBLpab55rSqbGkzjI2z+ucDZyl1gqIv9U6qceVsgRyuqdfVN4deU22
LzO5IEDhnGdFqg9KQY7u8zm686Ejs64T1sh0y4GOmGsSg+P6nsqkdlXH8C+Cf03F
lqPFg8WQC7ojl/S8dPmkT5tcJh3BPwIWuvbtVjFOGQc8x0lb+NwK8h2Nsn6LNazS
0H90adh/IyYX4sBMokrpxAi+gMAWiyJHIHLeH2itNKtAQd3qQowbrWNswJSgJzsT
JuJ7uqRKAFkE6nCeAkuj/6KHHMPsfCAffVdyGaWqhoxmPOrnVgECggEBAOrCCwiC
XxwUgjOfOKx68siFJLfHf4vPo42LZOkAQq5aUmcWHbJVXmoxLYSczyAROopY0wd6
Dx8rqnpO7OtZsdJMeBSHbMVKoBZ77hiCQlrljcj12moFaEAButLCdZFsZW4zF/sx
kWIAaPH9vc4MvHHyvyNoB3yQRdevu57X7xGf9UxWuPil/jvdbt9toaraUT6rUBWU
GYPNKaLFsQzKsFWAzp5RGpASkhuiBJ0Qx3cfLyirjrKqTipe3o3gh/5RSHQ6VAhz
gdUG7WszNWk8FDCL6RTWzPOrbUyJo/wz1kblsL3vhV7ldEKFHeEjsDGroW2VUFlS
asAHNvM4/uYcOSECggEBANYH0427qZtLVuL97htXW9kCAT75xbMwgRskAH4nJDlZ
IggDErmzBhtrHgR+9X09iL47jr7dUcrVNPHzK/WXALFSKzXhkG/yAgmt3r14WgJ6
5y7010LlPFrzaNEyO/S4ISuBLt4cinjJsrFpoo0WI8jXeM5ddG6ncxdurKXMymY7
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::.::
:::::::::::::::::::::::::::.::::::::::::::::::::::::::::::::::::
LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLlL
 
 
 
YYYYYYYYYYYYYYYYYYYYYyYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
gff0GJCOMZ65pMSy3A3cSAtjlKnb4fWzuHD5CFbusN4WhCT/tNxGNSpzvxd8GIDs
nY7exs9L230oCCpedVgcbayHCbkChEfoPzL1e1jXjgCwCTgt8GjeEFqc1gXNEaUn
O8AJ4VlR8fRszHm6yR0ZUBdY7UJddxQiYOzt0S1RLlECggEAbdcs4mZdqf3OjejJ
06oTPs9NRtAJVZlppSi7pmmAyaNpOuKWMoLPElDAQ3Q7VX26LlExLCZoPOVpdqDH
KbdmBEfTR4e11Pn9vYdu9/i6o10U4hpmf4TYKlqk10g1Sj21l8JATj/7Diey8scO
sAI1iftSg3aBSj8W7rxCxSezrENzuqw5D95a/he1cMUTB6XuravqZK5O4eR0vrxR
AvMzXk5OXrUEALUvt84u6m6XZZ0pq5XZxq74s8p/x1JvTwcpJ3jDKNEixlHfdHEZ
ZIu/xpcwD5gRfVGQamdcWvzGHZYLBFO1y5kAtL8kI9tW7WaouWVLmv99AyxdAaCB
Y5mBAQKCAQEAzU7AnorPzYndlOzkxRFtp6MGsvRBsvvqPLCyUFEXrHNV872O7tdO
GmsMZl+q+TJXw7O54FjJJvqSSS1sk68AGRirHop7VQce8U36BmI2ZX6j2SVAgIkI
9m3btCCt5rfiCatn2+Qg6HECmrCsHw6H0RbwaXS4RZUXD/k4X+sslBitOb7K+Y+N
Bacq6QxxjlIqQdKKPs4P2PNHEAey+kEJJGEQ7bTkNxCZ21kgi1Sc5L8U/IGy0BMC
PvJxssLdaWILyp3Ws8Q4RAoC5c0ZP0W2j+5NSbi3jsDFi0Y6/2GRdY1HAZX4twem
Q0NCedq1JNatP1gsb6bcnVHFDEGsj/35oQKCAQEAgmWMuSrojR/fjJzvke6Wvbox
FRnPk+6YRzuYhAP/YPxSRYyB5at++5Q1qr7QWn7NFozFIVFFT8CBU36ktWQ39MGm
cJ5SGyN9nAbbuWA6e+/u059R7QL+6f64xHRAGyLT3gOb1G0N6h7VqFT25q5Tq0rc
Lf/CvLKoudjv+sQ5GKBPT18+zxmwJ8YUWAsXUyrqoFWY/Tvo5yLxaC0W2gh3+Ppi
EDqe4RRJ3VKuKfZxHn5VLxgtBFN96Gy0+Htm5tiMKOZMYAkHiL+vrVZAX0hIEuRZ
JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
-----END RSA PRIVATE KEY-----" >> id_rsa
Now, if you try to reconstruct this key by removing the obvious garbage lines (the ones that are all repeated characters, some of which aren t even valid base64 characters), it still isn t a key at least, openssl pkey doesn t want anything to do with it. The key is very much still in there, though, as we shall soon see. Using a gem I wrote and a quick bit of Ruby, we can extract a complete private key. The irb session looks something like this:
>> require "derparse"
>> b64 = <<EOF
MIIJKgIBAAKCAgEAxEVih1JGb8gu/Fm4AZh+ZwJw/pjzzliWrg4mICFt1g7SmIE2
TCQMKABdwd11wOFKCPc/UzRH/fHuQcvWrpbOSdqev/zKff9iedKw/YygkMeIRaXB
fYELqvUAOJ8PPfDm70st9GJRhjGgo5+L3cJB2gfgeiDNHzaFvapRSU0oMGQX+kI9
ezsjDAn+0Pp+r3h/u1QpLSH4moRFGF4omNydI+3iTGB98/EzuNhRBHRNq4oBV5SG
Pq/A1bem2ninnoEaQ+OPESxYzDz3Jy9jV0W/6LvtJ844m+XX69H5fqq5dy55z6DW
sGKn78ULPVZPsYH5Y7C+CM6GAn4nYCpau0t52sqsY5epXdeYx4Dc+Wm0CjXrUDEe
Egl4loPKDxJkQqQ/MQiz6Le/UK9vEmnWn1TRXK3ekzNV4NgDfJANBQobOpwt8WVB
rbsC0ON7n680RQnl7PltK9P1AQW5vHsahkoixk/BhcwhkrkZGyDIl9g8Q/Euyoq3
eivKPLz7/rhDE7C1BzFy7v8AjC3w7i9QeHcWOZFAXo5hiDasIAkljDOsdfD4tP5/
wSO6E6pjL3kJ+RH2FCHd7ciQb+IcuXbku64ln8gab4p8jLa/mcMI+V3eWYnZ82Yu
axsa85hAe4wb60cp/rCJo7ihhDTTvGooqtTisOv2nSvCYpcW9qbL6cGjAXECAwEA
AQKCAgEAjz6wnWDP5Y9ts2FrqUZ5ooamnzpUXlpLhrbu3m5ncl4ZF5LfH+QDN0Kl
KvONmHsUhJynC/vROybSJBU4Fu4bms1DJY3C39h/L7g00qhLG7901pgWMpn3QQtU
4P49qpBii20MGhuTsmQQALtV4kB/vTgYfinoawpo67cdYmk8lqzGzzB/HKxZdNTq
s+zOfxRr7PWMo9LyVRuKLjGyYXZJ/coFaobWBi8Y96Rw5NZZRYQQXLIalC/Dhndm
AHckpstEtx2i8f6yxEUOgPvV/gD7Akn92RpqOGW0g/kYpXjGqZQy9PVHGy61sInY
HSkcOspIkJiS6WyJY9JcvJPM6ns4b84GE9qoUlWVF3RWJk1dqYCw5hz4U8LFyxsF
R6WhYiImvjxBLpab55rSqbGkzjI2z+ucDZyl1gqIv9U6qceVsgRyuqdfVN4deU22
LzO5IEDhnGdFqg9KQY7u8zm686Ejs64T1sh0y4GOmGsSg+P6nsqkdlXH8C+Cf03F
lqPFg8WQC7ojl/S8dPmkT5tcJh3BPwIWuvbtVjFOGQc8x0lb+NwK8h2Nsn6LNazS
0H90adh/IyYX4sBMokrpxAi+gMAWiyJHIHLeH2itNKtAQd3qQowbrWNswJSgJzsT
JuJ7uqRKAFkE6nCeAkuj/6KHHMPsfCAffVdyGaWqhoxmPOrnVgECggEBAOrCCwiC
XxwUgjOfOKx68siFJLfHf4vPo42LZOkAQq5aUmcWHbJVXmoxLYSczyAROopY0wd6
Dx8rqnpO7OtZsdJMeBSHbMVKoBZ77hiCQlrljcj12moFaEAButLCdZFsZW4zF/sx
kWIAaPH9vc4MvHHyvyNoB3yQRdevu57X7xGf9UxWuPil/jvdbt9toaraUT6rUBWU
GYPNKaLFsQzKsFWAzp5RGpASkhuiBJ0Qx3cfLyirjrKqTipe3o3gh/5RSHQ6VAhz
gdUG7WszNWk8FDCL6RTWzPOrbUyJo/wz1kblsL3vhV7ldEKFHeEjsDGroW2VUFlS
asAHNvM4/uYcOSECggEBANYH0427qZtLVuL97htXW9kCAT75xbMwgRskAH4nJDlZ
IggDErmzBhtrHgR+9X09iL47jr7dUcrVNPHzK/WXALFSKzXhkG/yAgmt3r14WgJ6
5y7010LlPFrzaNEyO/S4ISuBLt4cinjJsrFpoo0WI8jXeM5ddG6ncxdurKXMymY7
EOF
>> b64 += <<EOF
gff0GJCOMZ65pMSy3A3cSAtjlKnb4fWzuHD5CFbusN4WhCT/tNxGNSpzvxd8GIDs
nY7exs9L230oCCpedVgcbayHCbkChEfoPzL1e1jXjgCwCTgt8GjeEFqc1gXNEaUn
O8AJ4VlR8fRszHm6yR0ZUBdY7UJddxQiYOzt0S1RLlECggEAbdcs4mZdqf3OjejJ
06oTPs9NRtAJVZlppSi7pmmAyaNpOuKWMoLPElDAQ3Q7VX26LlExLCZoPOVpdqDH
KbdmBEfTR4e11Pn9vYdu9/i6o10U4hpmf4TYKlqk10g1Sj21l8JATj/7Diey8scO
sAI1iftSg3aBSj8W7rxCxSezrENzuqw5D95a/he1cMUTB6XuravqZK5O4eR0vrxR
AvMzXk5OXrUEALUvt84u6m6XZZ0pq5XZxq74s8p/x1JvTwcpJ3jDKNEixlHfdHEZ
ZIu/xpcwD5gRfVGQamdcWvzGHZYLBFO1y5kAtL8kI9tW7WaouWVLmv99AyxdAaCB
Y5mBAQKCAQEAzU7AnorPzYndlOzkxRFtp6MGsvRBsvvqPLCyUFEXrHNV872O7tdO
GmsMZl+q+TJXw7O54FjJJvqSSS1sk68AGRirHop7VQce8U36BmI2ZX6j2SVAgIkI
9m3btCCt5rfiCatn2+Qg6HECmrCsHw6H0RbwaXS4RZUXD/k4X+sslBitOb7K+Y+N
Bacq6QxxjlIqQdKKPs4P2PNHEAey+kEJJGEQ7bTkNxCZ21kgi1Sc5L8U/IGy0BMC
PvJxssLdaWILyp3Ws8Q4RAoC5c0ZP0W2j+5NSbi3jsDFi0Y6/2GRdY1HAZX4twem
Q0NCedq1JNatP1gsb6bcnVHFDEGsj/35oQKCAQEAgmWMuSrojR/fjJzvke6Wvbox
FRnPk+6YRzuYhAP/YPxSRYyB5at++5Q1qr7QWn7NFozFIVFFT8CBU36ktWQ39MGm
cJ5SGyN9nAbbuWA6e+/u059R7QL+6f64xHRAGyLT3gOb1G0N6h7VqFT25q5Tq0rc
Lf/CvLKoudjv+sQ5GKBPT18+zxmwJ8YUWAsXUyrqoFWY/Tvo5yLxaC0W2gh3+Ppi
EDqe4RRJ3VKuKfZxHn5VLxgtBFN96Gy0+Htm5tiMKOZMYAkHiL+vrVZAX0hIEuRZ
EOF
>> der = b64.unpack("m").first
>> c = DerParse.new(der).first_node.first_child
>> version = c.value
=> 0
>> c = c.next_node
>> n = c.value
=> 80071596234464993385068908004931... # (etc)
>> c = c.next_node
>> e = c.value
=> 65537
>> c = c.next_node
>> d = c.value
=> 58438813486895877116761996105770... # (etc)
>> c = c.next_node
>> p = c.value
=> 29635449580247160226960937109864... # (etc)
>> c = c.next_node
>> q = c.value
=> 27018856595256414771163410576410... # (etc)
What I ve done, in case you don t speak Ruby, is take the two chunks of plausible-looking base64 data, chuck them together into a variable named b64, unbase64 it into a variable named der, pass that into a new DerParse instance, and then walk the DER value tree until I got all the values I need. Interestingly, the q value actually traverses the split in the two chunks, which means that there s always the possibility that there are lines missing from the key. However, since p and q are supposed to be prime, we can sanity check them to see if corruption is likely to have occurred:
>> require "openssl"
>> OpenSSL::BN.new(p).prime?
=> true
>> OpenSSL::BN.new(q).prime?
=> true
Excellent! The chances of a corrupted file producing valid-but-incorrect prime numbers isn t huge, so we can be fairly confident that we ve got the real p and q. Now, with the help of another one of my creations we can use e, p, and q to create a fully-operational battle key:
>> require "openssl/pkey/rsa"
>> k = OpenSSL::PKey::RSA.from_factors(p, q, e)
=> #<OpenSSL::PKey::RSA:0x0000559d5903cd38>
>> k.valid?
=> true
>> k.verify(OpenSSL::Digest::SHA256.new, k.sign(OpenSSL::Digest::SHA256.new, "bob"), "bob")
=> true
and there you have it. One fairly redacted-looking private key brought back to life by maths and far too much free time. Sorry Mr. Finn, I hope you re not still using that key on anything Internet-facing.

What About Other Key Types? EC keys are very different beasts, but they have much the same problems as RSA keys. A typical EC key contains both private and public data, and the public portion is twice the size so only about 1/3 of the data in the key is private material. It is quite plausible that you can redact an EC key and leave all the actually private bits exposed.

What Do We Do About It? In short: don t ever try and redact real private keys. For documentation purposes, just put KEY GOES HERE in the appropriate spot, or something like that. Store your secrets somewhere that isn t a public (or even private!) git repo. Generating a dummy private key and sticking it in there isn t a great idea, for different reasons: people have this odd habit of reusing demo keys in real life. There s no need to encourage that sort of thing.
  1. Technically the pieces aren t 100% aligned with the underlying DER, because of how base64 works. I felt it was easier to understand if I stuck to chopping up the base64, rather than decoding into DER and then chopping up the DER.

14 October 2011

John Goerzen: Greece part 2: History (and sauntering up to guys with machine guns)

Terah and I went to the Greek island Rhodes recently. This is the second in a series about it. I am one to enjoy history. There is something deeply, well, connecting, about standing in an old place. There is a timeless quality to it a feeling of being connected to so many people of the past, and yet still being connected to change, visible in things such as weathering of stones. To gaze at pottery that s 300 years old, walk past 700-year-old walls, or pass through what remains of the grand portico of an ancient temple to Athena stirs a feeling I can barely explain, of timelessness. Although Rhodes doesn t have the famous Greek sites such as the Parthenon or Delphi, I can t help but wonder why the Rhodes sites aren t better known. They were incredible and it is hard to condense all that we saw into a short blog post. I have to start with the medieval Rhodes Old Town. We got off the bus a few blocks from it one bright morning, and our first task was to find a gate across the moat. Oh yes, A GATE ACROSS THE MOAT. It s a dry moat, and that bridge off in the distance is the gate we were headed to. Outside of the outer wall is a nice quiet walking area. The moat and walls completely surround Old Town and, for the most part, date back about 500 years. The round stones you see on that picture, we were told, were likely surplus from catapults and other projectile weapons. Cross one line of walls and you come to another, with original canons still present. The Knights Hospitaller of St. John, which held Rhodes for a few centuries until the Ottomans captured it, sure knew how to build to impress. The gate we happened to use was Amboise, the Grand Master s Gate. Right there is the stunningly rebuilt landmark Palace of the Grand Master. It is absolutely impossible for any photograph to begin to do this building justice. Between its imported Greek and Roman floors, to the grand nature of everything in it, and the archaeological museum in one corner, it was a fitting start to a visit to Old Town. Here s one of the main staircases. Just near the Palace is quiet courtyard with an old door. Pass through that door and suddenly you re in the midst of the busy Old Town. And among the landmarks in Old Town, the most prominent is Ippoton, the Avenue of the Knights. Along this avenue are the buildings built by the various nationalities of knights, many of which are historical sites in their own. Taken together, it is quite clear why Rhodes is said to be one of the world s best-preserved medieval cities. Down at the other end of Ippoton is the Knights Hospital, which is now part of the archaeological museum. Step off the Avenue a few blocks and you get to some quieter narrow streets just as old, in many cases. On Sunday morning, we were able to visit Mount Filerimos. In contrast to the busy Rhodes, Filerimos had an air of quiet and still to it. It was the site of a monastery, two historic churches, and a landmark Italian cross on the mountaintop. We arrived, and begin our visit with a walk up the quiet stone path. When we got to the top, we walked past this peaceful church. As we walked past the outside, we heard the beautiful music of chant from indoors. We got to step in and listen to mass for a few minutes. In typical fashion, directly in front of the church are two much older sites: one, the ruins of a temple to Athena, and the other a 4th-century Christian bapistery. Rhodes is a popular tourist destination, and of course we saw plenty of popular sites (such as the grandmaster s palace). Filerimos had a few tourists too, but not as many. I frequently like to operate on the plan of going wherever all the tourists aren t. And so, on Filerimos, that meant seeing what was behind the monastery. It started with this peaceful tree-lined path. And the deserted, but intentionally open, gate led to the remains of a Byzantine fortress, which had been a staging area for both the Knights and the Ottomans before their campaigns to capture Rhodes. It also provided incredible views of the surrounding countryside. The first historic site we had visited on our trip was the Acropolis of Lindos, parts of which are 2300 years old. Here s a view of the mountain from the rooftop of the Kalypso, our favorite restaurant in Lindos. The columns of the temple to Athena Lindia are visible, and of course so are the walls. The road up to the acropolis is accessible only on foot or by donkey. It is apparently the only road that has ever been used to get to the acropolis. Here is the partially-restored grand portico to the temple. There s an old Christian church (4th century, if memory serves) at the Acropolis too. The Acropolis makes some pretty good use of natural defenses too. Here s a view from one level of it. There s a manmade wall up there at the very top. And, of course, the beautiful Aegean always in the background. There are lots of cats on Rhodes. Here is a kitten napping at the top of the Lindos Acropolis: Lindos itself is a beautiful town. Here s one of the quieter streets: Notice the pebble steps leading into the houses those intricate pieces of artwork are all over. This post won t be complete without the story of our visit to the Acropolis of Rhodes. We walked there from Old Town. At the Acropolis, there are the remains of a temple to Apollo, an ancient theater, and an ancient stadium where qualifying matches for the Olympics were held. As we got closer to the area, we were repeatedly passed by people dressed in uniforms of various types. And as we got there, we joined a stream of people entering the area. The ancient stadium had apparently thousands of people in it, country names were being read off over the loudspeakers, policemen wielding machine guns were standing by, and we had absolutely no idea what was going on. At this point, you can appreciate the difference between Terah and me. Terah thought that we have no idea what is happening, she was tired from the walk, and so thought we should just leave. I thought that we have no idea what is happening, which is a great reason to stay. So Terah opted to sit and read a bit under some trees while I explored. Here s a view of the stadium as it was emptying out, seen from the theater: I explored the temple and theater, and eventually we were ready to head back. We knew there was a bus back to the New Market (from where we could get a bus back to our hotel), but didn t know where the bus stop was. The obvious place to ask were the policemen, which I thought I would do. Terah thought she would just stay sitting under the trees, on the grounds that the policemen nearest us were all carrying machine guns and perhaps wouldn t like to be disturbed. This led to my cryptic tweet:
Only ONE of us is the kind of person that goes up to guys with machine guns to ask what s happening. Me to Terah today
They told me that it was the preparations for the opening ceremony for a global shooting contest, and also gave me directions to the bus stop.