-
Notifications
You must be signed in to change notification settings - Fork 17
[NOTFORMERGE] Crude attempt at ripping data track(s) #106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Absolutely, it's something I've wanted to add support for but never had time. I'd prefer cdio_read_mode1_sectors, bypassing the codecs. |
|
OK, I think this is most of it, what do you reckon? Outstanding issues:
I settled on ripping one at a time for now because some of these discs have persistent read errors on single sectors. I think this is copy protection with deliberate Mode1 ECC errors (the lower level C2 seems correct), and by default the drive still checks these even if you're doing a 2352 byte read which returns all that ECC data. It's supposed to be possible to disable these by poking the error recovery page (page 1) using MODE SELECT 10, but on my drive at least I can set it but it doesn't change the behaviour at all. redumper works around this by just reading sectors in audio mode and aligning them using the drive offset; they're retried only if they didn't have clean C2. A civilised implementation would check the ECC and retry the read a few times to ensure it was accurate, but there's no ECC implementation in libcdio, and it's too complicated to DIY. Ripping is also ~10x slower than redumper on the same drive (100 vs. 1000 sectors/sec), even though both issue reads one sector at a time. libcdio tells me setting speed is unsupported, but I can do |
If you ask for autodetect or data sector type, the drive tries to verify the ECC for you, even if you ask it not to. This means sectors with ECC errors due to mastering can't be archived. By reading as audio and fixing the offset up ourselves, we can avoid the issue.
Testing on more discs it looks like some have a different offset in the second session? Anyway, just sync up and read like a real data drive would.
| frameptr = buf + offset_bytes; | ||
| if (memcmp(&buf[offset_bytes], CDIO_SECTOR_SYNC_HEADER, CDIO_CD_SYNC_SIZE)) { | ||
| printf("did not find sync header at lsn %d\n", next_lsn); | ||
| exit(1); // needs strategy to handle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return an error and let it propagate all the way back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my thinking this needs a bigger conversation, because reading data turns out to be a bit of a rabbit hole!
At the moment the other tracks are ripped with paranoia, which AIUI reads each sector at least twice? and rereads if they do not match. Meanwhile this code reads sectors only once and only has retry logic for the case where the drive reports a C2 error (or is supposed to - I haven't seen this happen yet). So I think this suffers from a lack of robustness: if a read is erroneous but doesn't fail C2, we can:
- not get a sync header where there was one (currently falls over)
- get the wrong frame msf (will also fail messily, though it might recover)
- silently write bad data into the output file
The nicest option would be to use paranoia to do the data reading, which would at least make sure it reads the same twice. I've tried rigging this up but, weirdly, the paranoia reads eventually jump a sector - which is exactly what it's not supposed to do? I have a number of discs where paranoia is happy with its read but it fails accurip until I retry it a few times, maybe it is an upstream bug relating to modern drives. Anyway, that's here: https://github.com/abrasive/cyanrip/tree/data-via-paranoia
With some further testing I also find that some drives outright refuse to read data sectors as audio. So I guess that leaves us with the simplest, least bad option of reading in "cooked" mode, which takes care of synchronisation etc. for us, but will return no sector data for any sector that the drive doesn't recognise. For the purposes of archiving interactive / PC-accessible extra content that sounds pretty acceptable I reckon.
| if (ctx->settings.ripping_retries) { | ||
| int matches = 0; | ||
| for (int i = 0; i < nb_last_checksums; i++) | ||
| matches += last_checksums[i] == checksum_ctx.eac_crc; | ||
|
|
||
| total_repeats++; | ||
| if (matches >= ctx->settings.ripping_retries) { | ||
| cyanrip_log(ctx, 0, "\nDone; (%i out of %i matches for current checksum %08X)\n", | ||
| matches, ctx->settings.ripping_retries, checksum_ctx.eac_crc); | ||
| goto finalize_ripping; | ||
| } | ||
| if (total_repeats >= ctx->settings.max_retries) { | ||
| cyanrip_log(ctx, 0, "\nDone; (no matches found, but hit repeat limit of %i)\n", | ||
| ctx->settings.max_retries); | ||
| goto finalize_ripping; | ||
| } | ||
|
|
||
| cyanrip_log(ctx, 0, "\nRepeating ripping (%i out of %i matches for current checksum %08X)\n", | ||
| matches, ctx->settings.ripping_retries, checksum_ctx.eac_crc); | ||
|
|
||
| last_checksums = av_realloc(last_checksums, | ||
| (nb_last_checksums + 1)*sizeof(*last_checksums)); | ||
| if (!last_checksums) { | ||
| ret = AVERROR(ENOMEM); | ||
| goto out; | ||
| } | ||
|
|
||
| last_checksums[nb_last_checksums] = checksum_ctx.eac_crc; | ||
| nb_last_checksums++; | ||
|
|
||
| for (int i = 0; i < ctx->settings.outputs_num; i++) | ||
| fseek(binfps[i], 0, SEEK_SET); | ||
|
|
||
| goto repeat_ripping; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you export this as a function which gets used by both functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely, should we do the same with progress / time estimation as well?
Hi, thank you so much for all your amazing work on this!
I've got a number of discs that have a data track, usually in a second session as the last track, but in at least one case in a single session as the first track (!). It'd be lovely to have them ripped along with the audio content.
At the moment I get a .bin file, which is a beautiful FLAC with no content, so I took a poke at it and now I get a FLAC with an entire ISO in it! Plus a bunch of other junk... Ripping multiple times with -Z does let me checksum and repeat, which is nice.
I got a bit stuck on libcdio though. Dumping it as a 2352 byte bin (matching the MODE1/2352 you are currently writing to the cue) would be the most traditional archiving method, but libcdio doesn't officially support it! Using the audio read methods sort of works, but I get 36 bytes at the start of each sector that don't belong there... I guess that's what the sync header is for but it also smells a bit like that's going to be drive dependent and messy down the track (if you'll pardon the pun, I promise that wasn't on purpose!)
The other option is to suck it up and do 2048 byte sectors with
cdio_read_mode1_sectors. This seems to work just fine, no sync issues, most portable, and gives you an .iso directly; an enthusiastic person would even use cdio's libiso9660 to unpack it in situ. But that means either duplicating or hacking up the ripping loop to handle 2048 byte frames, bypassing the codecs (or can we force data tracks to S16 PCM as a passthrough?) and ideally reading (a lot) more than one frame at a time.What do you think? Is this a feature you would like? How would you like to see it implemented?