Skip to content

Conversation

@abrasive
Copy link

@abrasive abrasive commented Apr 9, 2025

Hi, thank you so much for all your amazing work on this!

I've got a number of discs that have a data track, usually in a second session as the last track, but in at least one case in a single session as the first track (!). It'd be lovely to have them ripped along with the audio content.

At the moment I get a .bin file, which is a beautiful FLAC with no content, so I took a poke at it and now I get a FLAC with an entire ISO in it! Plus a bunch of other junk... Ripping multiple times with -Z does let me checksum and repeat, which is nice.

I got a bit stuck on libcdio though. Dumping it as a 2352 byte bin (matching the MODE1/2352 you are currently writing to the cue) would be the most traditional archiving method, but libcdio doesn't officially support it! Using the audio read methods sort of works, but I get 36 bytes at the start of each sector that don't belong there... I guess that's what the sync header is for but it also smells a bit like that's going to be drive dependent and messy down the track (if you'll pardon the pun, I promise that wasn't on purpose!)

The other option is to suck it up and do 2048 byte sectors with cdio_read_mode1_sectors. This seems to work just fine, no sync issues, most portable, and gives you an .iso directly; an enthusiastic person would even use cdio's libiso9660 to unpack it in situ. But that means either duplicating or hacking up the ripping loop to handle 2048 byte frames, bypassing the codecs (or can we force data tracks to S16 PCM as a passthrough?) and ideally reading (a lot) more than one frame at a time.

What do you think? Is this a feature you would like? How would you like to see it implemented?

@cyanreg
Copy link
Owner

cyanreg commented Apr 9, 2025

Absolutely, it's something I've wanted to add support for but never had time.
I'd prefer for it to output raw .bin, such that it matches the .cue file, unless a switch is given, in which case it would output .iso.

I'd prefer cdio_read_mode1_sectors, bypassing the codecs.

@abrasive
Copy link
Author

OK, I think this is most of it, what do you reckon?

Outstanding issues:

  • Progress / logging are not implemented - I defer to your personal tastes on this
  • The "Tracks ripped accurately: X/Y" probably shouldn't include data tracks in Y
  • Ripping data is currently done one sector at a time

I settled on ripping one at a time for now because some of these discs have persistent read errors on single sectors. I think this is copy protection with deliberate Mode1 ECC errors (the lower level C2 seems correct), and by default the drive still checks these even if you're doing a 2352 byte read which returns all that ECC data. It's supposed to be possible to disable these by poking the error recovery page (page 1) using MODE SELECT 10, but on my drive at least I can set it but it doesn't change the behaviour at all.

redumper works around this by just reading sectors in audio mode and aligning them using the drive offset; they're retried only if they didn't have clean C2. A civilised implementation would check the ECC and retry the read a few times to ensure it was accurate, but there's no ECC implementation in libcdio, and it's too complicated to DIY.

Ripping is also ~10x slower than redumper on the same drive (100 vs. 1000 sectors/sec), even though both issue reads one sector at a time. libcdio tells me setting speed is unsupported, but I can do mmc_set_speed(ctx->cdio, 0xffff, 0) and it speeds right up... Fix upstream or just use mmc_set_speed instead?

abrasive added 3 commits April 10, 2025 19:34
If you ask for autodetect or data sector type, the drive tries to verify
the ECC for you, even if you ask it not to. This means sectors with ECC
errors due to mastering can't be archived.

By reading as audio and fixing the offset up ourselves, we can avoid the
issue.
Testing on more discs it looks like some have a different offset in the
second session? Anyway, just sync up and read like a real data drive
would.
frameptr = buf + offset_bytes;
if (memcmp(&buf[offset_bytes], CDIO_SECTOR_SYNC_HEADER, CDIO_CD_SYNC_SIZE)) {
printf("did not find sync header at lsn %d\n", next_lsn);
exit(1); // needs strategy to handle
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return an error and let it propagate all the way back.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my thinking this needs a bigger conversation, because reading data turns out to be a bit of a rabbit hole!

At the moment the other tracks are ripped with paranoia, which AIUI reads each sector at least twice? and rereads if they do not match. Meanwhile this code reads sectors only once and only has retry logic for the case where the drive reports a C2 error (or is supposed to - I haven't seen this happen yet). So I think this suffers from a lack of robustness: if a read is erroneous but doesn't fail C2, we can:

  1. not get a sync header where there was one (currently falls over)
  2. get the wrong frame msf (will also fail messily, though it might recover)
  3. silently write bad data into the output file

The nicest option would be to use paranoia to do the data reading, which would at least make sure it reads the same twice. I've tried rigging this up but, weirdly, the paranoia reads eventually jump a sector - which is exactly what it's not supposed to do? I have a number of discs where paranoia is happy with its read but it fails accurip until I retry it a few times, maybe it is an upstream bug relating to modern drives. Anyway, that's here: https://github.com/abrasive/cyanrip/tree/data-via-paranoia

With some further testing I also find that some drives outright refuse to read data sectors as audio. So I guess that leaves us with the simplest, least bad option of reading in "cooked" mode, which takes care of synchronisation etc. for us, but will return no sector data for any sector that the drive doesn't recognise. For the purposes of archiving interactive / PC-accessible extra content that sounds pretty acceptable I reckon.

Comment on lines +662 to +696
if (ctx->settings.ripping_retries) {
int matches = 0;
for (int i = 0; i < nb_last_checksums; i++)
matches += last_checksums[i] == checksum_ctx.eac_crc;

total_repeats++;
if (matches >= ctx->settings.ripping_retries) {
cyanrip_log(ctx, 0, "\nDone; (%i out of %i matches for current checksum %08X)\n",
matches, ctx->settings.ripping_retries, checksum_ctx.eac_crc);
goto finalize_ripping;
}
if (total_repeats >= ctx->settings.max_retries) {
cyanrip_log(ctx, 0, "\nDone; (no matches found, but hit repeat limit of %i)\n",
ctx->settings.max_retries);
goto finalize_ripping;
}

cyanrip_log(ctx, 0, "\nRepeating ripping (%i out of %i matches for current checksum %08X)\n",
matches, ctx->settings.ripping_retries, checksum_ctx.eac_crc);

last_checksums = av_realloc(last_checksums,
(nb_last_checksums + 1)*sizeof(*last_checksums));
if (!last_checksums) {
ret = AVERROR(ENOMEM);
goto out;
}

last_checksums[nb_last_checksums] = checksum_ctx.eac_crc;
nb_last_checksums++;

for (int i = 0; i < ctx->settings.outputs_num; i++)
fseek(binfps[i], 0, SEEK_SET);

goto repeat_ripping;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you export this as a function which gets used by both functions?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, should we do the same with progress / time estimation as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants