azonenberg,
@azonenberg@ioc.exchange avatar

Ok, this one is for the discrete gurus out there.

Let N = CRC32(X)

Given N, is it possible to efficiently calculate CRC32(concat(X, Y)) where Y is a known sized, but very long, sequence of 0xFF bytes?

Obviously you can just seed the CRC with N and iterate, feeding 0xFF in each cycle, but is there any kind of shortcut you can take if you know the input is always a 1 bit?

destevez,
@destevez@mastodon.social avatar

@azonenberg nice question! My intuition tells me that there are no major shortcuts. Thinking of the CRC hardware implementation, you initialize the register with N and keep pushing many 1's. This is very similar to an LFSR and also to multiplication on a cyclic group. The intuition is that the register states evolve in a random-like way and it is quite difficult to predict how the register will look like after many steps without doing the calculation for those many steps.

destevez,
@destevez@mastodon.social avatar

@azonenberg this is not a proof whatsoever, and maybe there's something I'm missing and there is a good shortcut. I'll try to think about this in more formal terms to see if there is something that can be proved.

azonenberg,
@azonenberg@ioc.exchange avatar
destevez,
@destevez@mastodon.social avatar

@azonenberg I now see where my intuition failed. The problem is pretty much the same as the following: given an element g in a finite group and a positive integer n, compute g^n (see that answer about matrix multiplication). This can always be done if O(log n) multiplications, even without precomputing anything. Just compute g^2, g^4 = g^2*g^2, ..., g^2^(floor(log_2 n)), and then multiply those powers corresponding to the 1's in the binary expansion of n.

destevez,
@destevez@mastodon.social avatar

@azonenberg what is hard is solving the discrete logarithm problem (some cryptography is based on this). This is, given g and h, find n such that h = g^n. For this, the best algorithm known is brute force, which is computationally prohibitive. In your case, this would be asking how many 0xff bytes you need to get a particular given CRC at the end.

azonenberg,
@azonenberg@ioc.exchange avatar

@destevez Yep. This is discrete exponentiation, far easier :)

8051enthusiast,
@8051enthusiast@mastodon.social avatar

@azonenberg i know you already have like 2 solutions, but i got nerd-sniped by this and wanted to play around with normal bases: https://gist.github.com/8051Enthusiast/80c044b3a6f6ad145617eebe11eb9fb1

it only uses bit shifts/rotates/xor and applies constant bit matrices to an int, so except for the 384 bytes storage requirements for the constant matrices, it should be easy to implement
(the matrices are specific to the zlib crc32)

azonenberg,
@azonenberg@ioc.exchange avatar

@8051enthusiast If you make one with matrices for the Ethernet polynomial I'll give it a try.

philpem,
@philpem@digipres.club avatar

@azonenberg In theory at least, a CRC is a linear-feedback shift register. After a certain number of static bits it's going to start repeating itself. My first answer would be a table lookup indexed by [CRC32(X)] and [N] (where N = number of FF bytes).
The problem is that's going to be a big table. It'll probably also have repetitions.

attie,
@attie@chaos.social avatar

@azonenberg Would it not be far preferable to somehow capture and encode the length of X inside itself?

For example, place a uint32 right at the start, or at some predefined place close to the start (e.g: if you have startup code / interrupt vectors at the start)...

azonenberg,
@azonenberg@ioc.exchange avatar

@attie It gets tricky, I've been thinking of how I'd do it but given the use cases of how the system is being used I'd rather not overcomplicate if I can.

Checksumming all of flash isn't THAT hard.

whitequark,
@whitequark@mastodon.social avatar

@azonenberg cursed solution: synthesize an unrolled combinational CRC32 calculator and then grab the function it optimizes to

azonenberg,
@azonenberg@ioc.exchange avatar

@whitequark I forgot to specify, the length of Y is known but not constant.

So I might need CRC32(X, 253K of 0xff) one minute and CRC32(X, 252K of 0xFF) the next.

Use case is calculating the expected CRC of an entire flash partition given only the data I'm writing into it and the a priori knowledge that it's blank.

jix,
@jix@mathstodon.xyz avatar

@azonenberg @whitequark CRC32-UPDATE(N, 0xff) is an affine function in the input state N when you view N as a 32 element vector over GF(2), i.e. you can represent it as a 32x32 bit matrix for the linear part and a 32 bit vector for the constant offset. Like is common in 3d graphics we can use homogeneous coordinates to represent both together in a 33x33 matrix so we can easily compose things including translations by using only matrix multiplication. You can then use exponentiation by squaring to quickly compute any power of that matrix to get a matrix representation of the state update function for adding any given number of 0xff bytes.

Here is some python code that does that: https://gist.github.com/jix/6fe33ef0900d41f550cd52b33ab88d55

jix,
@jix@mathstodon.xyz avatar

@azonenberg also, if you squint hard enough this is what @whitequark suggested, but done for every power of 2 length (reusing the optimized function of the preceding power of two and exploiting some known properties of crcs to do the optimization) and then you pick the resulting functions for every bit that's set in the number of 0xff bytes you want to add and compose them

sgf,
@sgf@mastodon.xyz avatar

@jix @azonenberg Thank you for coming here and saying what I was going to say, but more mathematically.

+1 to treating the CRC step as a matrix multiplication and doing matrix exponentiation to fast forward in log time.

jix,
@jix@mathstodon.xyz avatar

@azonenberg is this useful to you? I could also write a verilog or C implementation (with a python script to generate a few small tables I'd use for those) if that would help. I wouldn't mind doing that even if you don't end up using that after all as long as you think it'd be worth giving it a try.

azonenberg,
@azonenberg@ioc.exchange avatar

@jix A permissively licensed C implementation would be quite handy if you're willing to throw one together.

I'm using the standard zip/Ethernet CRC32 formulation.

jix,
@jix@mathstodon.xyz avatar

@azonenberg Just pushed to https://github.com/jix/fast_fwd_crc

Has an example that can be run with make run that checks against zlib, and is 0BSD licensed.

Ended up using the polynomial quotient ring approach instead of the matrix approach as it's faster, doesn't need any tables, and, after figuring out the right bit order everywhere, simpler to implement.

The high level idea is still the same as for the matrix approach.

azonenberg,
@azonenberg@ioc.exchange avatar

@jix Thanks! Will check it out after work.

breakin,
@breakin@mastodon.gamedev.place avatar

@azonenberg @whitequark Some googling gave me https://groups.google.com/g/comp.compression/c/SHyr5bp5rtc/m/PP5-pmv9-9sJ. The second answer from the top (by Dr Mark Adler) has a lot of good information in it that seems applicable!

  • All
  • Subscribed
  • Moderated
  • Favorites
  • math
  • magazineikmin
  • rosin
  • Youngstown
  • ngwrru68w68
  • khanakhh
  • slotface
  • InstantRegret
  • mdbf
  • GTA5RPClips
  • kavyap
  • thenastyranch
  • DreamBathrooms
  • everett
  • tacticalgear
  • JUstTest
  • tester
  • Durango
  • cubers
  • ethstaker
  • cisconetworking
  • modclub
  • osvaldo12
  • Leos
  • normalnudes
  • megavids
  • provamag3
  • anitta
  • lostlight
  • All magazines