Samuel Neves
@sevenps
1.3K posts
Joined December 2010
241 Following · 645 Followers
Samuel Neves@sevenps·
@ciphergoth @oconnor663 @cryptodavidw It wouldn't make much sense to benchmark universal hashes against unkeyed collision-resistant hash functions. But an almost (Δ-)universal hash section on eBACS could be useful on its own.
Jack O'Connor@oconnor663·
Bragging: In this SUPERCOP benchmark of long inputs on an AVX-512 machine, BLAKE3 is the fastest hash function. Not the fastest cryptographic hash function. The fastest hash function. bench.cr.yp.to/results-hash.h… (#amd64-ygritte)
JP Aumasson@veorq·
what are some examples of bad puns in crypto paper titles? things like "What the Fork: Implementation Aspects of a Forkcipher" and "Does gate count matter? Hardware efficiency of (..)"
Matthew Green@matthew_d_green·
In 2005 my research group reverse-engineered an automotive cipher called DST40. Fifteen years later Tesla is using a variant of the same cipher: the DST80. Spoiler: it is not 2^40 times as strong. tches.iacr.org/index.php/TCHE…
Samuel Neves@sevenps·
@kste_ @ciphergoth @veorq @WatsonLadd @SchmiegSophie The best trail (the usual caveats apply) for Salsa jumps from 2^-18 to 2^-46 for 3 to 4 rounds; the best trail for ChaCha jumps from 2^-12 to 2^-39. But restricting the differences to the attacker-controlled 128 bits instead of the entire space would greatly decrease these probabilities.
Samuel Neves@sevenps·
@SeanieCurran @veorq The comparison formulas there were derived independently and, if I remember correctly, unsigned < requires one fewer operation than Hacker's Delight.
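For readers following along, here is a minimal Python sketch of two branch-free "unsigned less than" predicates. The first is the form given in Hacker's Delight; the second is a shorter five-operation variant of the kind seen in constant-time coding guidelines. The function names and the `bits` parameter are assumptions made for illustration, not something from the thread, and Python integers are masked by hand to model fixed-width words.

```python
def lt_u_hackers_delight(x: int, y: int, bits: int = 32) -> int:
    """Unsigned x < y via (~x & y) | ((~x | y) & (x - y)); result is the top bit."""
    mask = (1 << bits) - 1
    not_x = ~x & mask                  # bitwise NOT within the word width
    diff = (x - y) & mask              # wrapping subtraction
    t = (not_x & y) | ((not_x | y) & diff)
    return t >> (bits - 1)             # 1 if x < y, else 0


def lt_u_short(x: int, y: int, bits: int = 32) -> int:
    """Unsigned x < y via x ^ ((x ^ y) | ((x - y) ^ y)); result is the top bit."""
    mask = (1 << bits) - 1
    diff = (x - y) & mask              # wrapping subtraction
    t = x ^ ((x ^ y) | (diff ^ y))
    return (t & mask) >> (bits - 1)    # 1 if x < y, else 0
```

Both functions assume `0 <= x, y < 2**bits`. The second needs no bitwise NOT, which is where the saved operation comes from.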
JP Aumasson@veorq·
the "crypto coding rules" are back at github.com/veorq/cryptoco… originally started this in 2013, haven't touched it in years, just did some cleanup and update but still a lot of work needed! PRs welcome :)
Samuel Neves@sevenps·
@chrisrohlf Probably a similar interface to WRMSR or XSETBV: register index in ECX, upper bits in EDX. Since there's only one 32-bit register so far, both are hardcoded to 0.
Samuel Neves@sevenps·
@kode54 The answer is no. The construction appears deceptively simple, but its security is not in question.
Matthew Green@matthew_d_green·
@mjos_crypto Has anyone ever optimized these ciphers to work more efficiently when enciphering sequential counters as opposed to CBC/OCB where you have to feed actual plaintext into the cipher?
mjos\dwez @m-jos.bsky.social@mjos_crypto·
Thanks to the inherent parallelism of AES-GCM (its only saving grace), future AVX512 CPUs can encrypt/decrypt+authenticate four AES blocks in parallel with VAESENC, VAESENCLAST, and VPCLMULQDQ. Why they're wasting huge amounts of area on VAESDEC, VAESDECLAST is a mystery (not needed).
Jack O'Connor@oconnor663·
@sevenps I wonder if that would "hardcode" too many of the particular features of the BLAKE2 compression function. For example, would this general interface take a "root node" flag or a "leaf vs parent" IV parameter?
Jack O'Connor@oconnor663·
@zooko @sevenps the latest benchmarks at github.com/oconnor663/bao… have BLAKE2s beating BLAKE2b after all. Both versions benefit from keeping the state words in transposed form while hashing multiple inputs, to avoid transposing them over and over. But BLAKE2s benefits much more.
Samuel Neves@sevenps·
@oconnor663 I meant specify Bao in terms of a compression function (e.g., the one underlying blake2*) instead of a variable input size hash function.
Jack O'Connor@oconnor663·
@sevenps Could you clarify "to specify the hash"? Do you mean like exposing a Bao API that takes a compression function as a parameter?
Samuel Neves@sevenps·
@oconnor663 rdtsc(p) no longer counts cycles in most chips; it is a timer that runs at the nominal frequency of the processor, but the processor itself can clock higher or lower. So you need to force it to also run at the nominal frequency to have reasonably accurate cycle counts.
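To make the arithmetic behind this concrete, a small Python sketch of how a timestamp-counter reading relates to real core cycles when the core is not running at its nominal frequency. The function name and the example frequencies are hypothetical; real measurements would need the actual core clock, which is why pinning the frequency is the simpler option.

```python
def tsc_ticks_to_core_cycles(ticks: float, nominal_hz: float, actual_hz: float) -> float:
    """Convert timestamp-counter ticks (counted at nominal_hz) to core cycles.

    The elapsed time is ticks / nominal_hz seconds, during which the core
    executes actual_hz cycles per second.
    """
    return ticks * actual_hz / nominal_hz


# Hypothetical chip: 3.0 GHz nominal, boosting to 4.0 GHz during the benchmark.
measured_ticks = 3000
true_cycles = tsc_ticks_to_core_cycles(measured_ticks, 3_000_000_000, 4_000_000_000)
```

In this made-up case the counter reports 3000 ticks while the core actually spent 4000 cycles, so treating ticks as cycles makes the code look faster than it is.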
Samuel Neves@sevenps·
@oconnor663 Another thing---those (particularly the single-threaded) numbers are either too good to be true, or you're not actually disabling Turbo Boost for measuring.
Samuel Neves@sevenps·
@oconnor663 There's little point in an AVX2 implementation of BLAKE2s, beyond taking advantage of AVX512F's native rotation instructions and such. On another note, have you considered using the compression function directly to specify the hash?
Samuel Neves@sevenps·
@oconnor663 @zooko NEON should make a big difference, seeing that it has native 64-bit addition. On SUPERCOP, blake2b generally outperforms blake2s where NEON is present, e.g., bench.cr.yp.to/results-hash.h… (#armeabi-pi2). On the other hand, blake2s does not generally benefit from NEON, but tree'd blake2s might.
Jack O'Connor@oconnor663·
@zooko @sevenps I just put up some preliminary benchmark results for 32-bit ARM at github.com/oconnor663/bao…. As expected, BLAKE2s dramatically outperforms BLAKE2b. I don't know if NEON would affect things in either direction, but I haven't ported anything yet.
Samuel Neves@sevenps·
@oconnor663 @zooko Twitter is really not the best medium for this. Everything's out of order. blake2sp is essentially the same speed as blake2bp but is more sensitive to compiler codegen quirks, so depending on compiler version/flags it is often slower.
Jack O'Connor@oconnor663·
@zooko @sevenps God so many threading fails :p Believe it or not this is my first long Twitter thread.
Jack O'Connor@oconnor663·
@zooko I've been working on a tree hash based on BLAKE2b, and it's at the point where it needs a review from a Real Cryptographer. Do you know anyone who might be interested in collaborating on something like that? github.com/oconnor663/bao
Samuel Neves@sevenps·
@oe1cxw @rygorous Neat, the high part does the inversion itself. You can also compute the xor of any number of rotations of a word; for example SHA-256's S1 is doable as clmul(e, 0x4200080) ^ clmulh(e, 0x4200080).
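The S1 identity above can be checked with a plain Python model of 32-bit carry-less multiplication. This is a sketch, assuming `clmul` returns the low 32 bits and `clmulh` the high 32 bits of the 64-bit carry-less product; the helper names `clmul64`, `rotr32`, and the two `sha256_S1_*` functions are introduced here for illustration. The constant works because 0x4200080 has bits 26, 21, and 7 set, and a left rotation by 26, 21, or 7 is a right rotation by 6, 11, or 25.

```python
MASK32 = 0xFFFFFFFF


def clmul64(a: int, b: int) -> int:
    """Full 64-bit carry-less (XOR) product of two 32-bit words."""
    r = 0
    for i in range(32):
        if (b >> i) & 1:
            r ^= a << i
    return r


def clmul(a: int, b: int) -> int:
    """Low 32 bits of the carry-less product."""
    return clmul64(a, b) & MASK32


def clmulh(a: int, b: int) -> int:
    """High 32 bits of the carry-less product."""
    return clmul64(a, b) >> 32


def rotr32(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def sha256_S1_reference(e: int) -> int:
    """SHA-256 big sigma 1: ROTR6 ^ ROTR11 ^ ROTR25."""
    return rotr32(e, 6) ^ rotr32(e, 11) ^ rotr32(e, 25)


def sha256_S1_clmul(e: int) -> int:
    """Same function, built from the low and high halves of one product."""
    return clmul(e, 0x4200080) ^ clmulh(e, 0x4200080)
```

For each set bit i of the constant, the low half contributes the bits of `e << i` that stay in the word and the high half contributes the bits that fell off the top, so XORing the halves yields the rotation.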
Samuel Neves@sevenps·
@oe1cxw @rygorous Since the Gray code is bit-reversed polynomial multiplication by x + 1, whose inverse modulo x^32 is all 1s, you can also have grev32(clmul32(grev32(x ^ (x >> 1), 31), -1), 31) == x.
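A quick way to convince yourself of the Gray code inversion is a Python model of the two primitives. This is a sketch under stated assumptions: `grev32` here is a software model of the generalized bit reverse (with k = 31 it is a full 32-bit reversal), `clmul32` keeps only the low 32 bits of the carry-less product, and `gray` / `gray_inverse` are names made up for this example.

```python
MASK32 = 0xFFFFFFFF


def grev32(x: int, k: int) -> int:
    """Generalized bit reverse of a 32-bit word; k = 31 reverses all bits."""
    if k & 1:
        x = ((x & 0x55555555) << 1) | ((x & 0xAAAAAAAA) >> 1)
    if k & 2:
        x = ((x & 0x33333333) << 2) | ((x & 0xCCCCCCCC) >> 2)
    if k & 4:
        x = ((x & 0x0F0F0F0F) << 4) | ((x & 0xF0F0F0F0) >> 4)
    if k & 8:
        x = ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)
    if k & 16:
        x = ((x & 0x0000FFFF) << 16) | ((x & 0xFFFF0000) >> 16)
    return x


def clmul32(a: int, b: int) -> int:
    """Low 32 bits of the carry-less product, i.e. multiplication modulo x^32."""
    r = 0
    for i in range(32):
        if (b >> i) & 1:
            r ^= a << i
    return r & MASK32


def gray(x: int) -> int:
    """Binary-reflected Gray code of a 32-bit word."""
    return x ^ (x >> 1)


def gray_inverse(g: int) -> int:
    """Invert gray() by multiplying the bit-reversed word by all ones (-1 mod 2^32)."""
    return grev32(clmul32(grev32(g, 31), MASK32), 31)
```

After bit reversal, `x ^ (x >> 1)` becomes `r ^ (r << 1)`, which is `r` times `x + 1` modulo x^32; multiplying by the all-ones polynomial undoes that, and the second reversal restores the original bit order.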
Paul Crowley@ciphergoth·
Ever seen an algorithm so neat you have to share it? This is just the best way I've ever seen to generate every permutation of a list, and I can't find it online anywhere else. gist.github.com/ciphergoth/3d8…