
Support Kitty graphics protocol mvp#5619

Draft
anthonykim1 wants to merge 48 commits intomasterfrom
anthonykim1/scaffoldKittyAddon

Conversation


@anthonykim1 anthonykim1 commented Jan 22, 2026

Part of #5592

WIP

Plan:

  1. (Done with Add register apc handler #5617) First PR: add registerAPCHandler to the parser, expose it in the API, add tests
  2. Scaffold addon:
    • Create addon-kitty-graphics, copy structure from other addons
    • Hook up in demo client.ts
      • Search addon-progress and ProgressAddon for how to do this
    • Get it working and activating
      • (Checked, remember to remove comments) Add a console.log to the addon's activate function
  3. Come up with a simple test command to verify in kitty or ghostty (see send-png)
  4. Set up APC handler, get it to trigger with the test
  5. Create a playwright test (test folder) in the addon that automates it for a simple image
    • Use a single 1x1 black-pixel PNG image; write it to the terminal in the test and verify the rendered image is black
    • Use a 3x1 PNG image: full red (255) first pixel, full green second, full blue third
  6. Understand what APC sequences are being sent by the program
  7. Implement the handlers
  8. Draw to a canvas layer
  9. Make sure we get the playwright tests to pass.
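For plan steps 3-5, a minimal test sequence can be generated by hand. A sketch (hypothetical helper, not part of the addon): per the kitty graphics protocol spec, a=T means transmit-and-display, f=24 is raw RGB pixel data, and s/v give width/height in pixels:

```typescript
// Sketch: build a kitty graphics transmit-and-display sequence for raw RGB
// pixel data. ESC _ G <control data> ; <base64 payload> ESC \
function kittySeq(width: number, height: number, rgb: Uint8Array): string {
  const payload = Buffer.from(rgb).toString('base64');
  return `\x1b_Gf=24,s=${width},v=${height},a=T;${payload}\x1b\\`;
}
```

Writing kittySeq(1, 1, new Uint8Array([0, 0, 0])) to a kitty or ghostty terminal should render a single black pixel; this also gives the playwright tests a deterministic payload ('AAAA' is the base64 of three zero bytes).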

(Jan 22, 4:00 PM PST) There's currently a problem where kitty +kitten icat won't display an image, but using send-image file will.

Copilot's research on why a normal kitty +kitten icat doesn't work on the xterm.js demo right now:

kitty icat uses TIOCGWINSZ ioctl to get pixel dimensions from the PTY before sending any graphics. node-pty doesn't set pixel values in that ioctl, so it returns 0. kitty icat sees 0 pixels and errors out before even trying the graphics protocol.

(Jan 22, 6:30 PM PST)
Things like kitty +kitten icat --use-window-size=80,24,1280,768 --transfer-mode=stream ~/Desktop/xterm.js/demo/logo.png work at the moment.

TODO, since the image currently renders on top of text 🥶:

Calculate how many rows/cols the image spans
Move the cursor accordingly (or reserve blank space)

TODOs:

  • Make sure wasm perf hasn't regressed
  • Make sure the cursor ends up in the right place
  • Make sure replacement and deletion work
  • Make sure all the other feedback is addressed
  • Bit-shifting improvements mentioned below
  • File system support will probably need to be hooked up with VS Code
  • Make sure placement works correctly with arguments; check other apps and how they place images
  • Share decoder instance: Support Kitty graphics protocol mvp #5619 (comment)
  • Get the MVP out in February!!

@anthonykim1 anthonykim1 self-assigned this Jan 22, 2026
@anthonykim1 anthonykim1 marked this pull request as draft January 22, 2026 19:57
@anthonykim1 anthonykim1 changed the title Support Kitty graphics protocol Support Kitty graphics protocol mvp Jan 22, 2026

jerch commented Jan 23, 2026

...node-pty doesn't set pixel values in that ioctl...

Imho this should be fixed in node-pty to populate the pixel dimensions where possible. (I think there was an issue with conpty not providing it, but for other platforms it can be populated.)

Another workaround for console apps is to use WinOps sequences like CSI 14t (see here for an example: https://github.com/jerch/sixel-puzzle/blob/56129538af70fa4ec9441d1d3553398dc8e66f1f/termlib.py#L95). But of course this is beyond xterm.js and has to be implemented by the console app itself.
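For reference, a console-app-side sketch of that workaround (hypothetical helper; the reply format for CSI 14 t is ESC [ 4 ; height ; width t per XTWINOPS):

```typescript
// Sketch: parse the terminal's reply to the CSI 14 t pixel-size query.
function parsePixelSizeReport(reply: string): { width: number; height: number } | undefined {
  const m = /\x1b\[4;(\d+);(\d+)t/.exec(reply);
  return m ? { height: parseInt(m[1], 10), width: parseInt(m[2], 10) } : undefined;
}

// Usage in a real app (not run here): put stdin in raw mode, write the
// query, then parse the terminal's reply:
// process.stdout.write('\x1b[14t');
// process.stdin.once('data', d => console.log(parsePixelSizeReport(d.toString())));
```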

Comment on lines 227 to 231
const binaryString = atob(base64Data);
const bytes = new Uint8Array(binaryString.length);
for (let i = 0; i < binaryString.length; i++) {
  bytes[i] = binaryString.charCodeAt(i);
}
Member

@jerch jerch Jan 23, 2026

This has very bad runtime and creates nightmarish memory pressure with huge images.

You might want to check the wasm based base64 decoder I used for the IIP protocol. It works on chunks, thus should be much friendlier to chunked data ingestion.

For basic usage see https://github.com/jerch/xterm-wasm-parts, for chunked usage addons/addon-image/src/IIPHandler.ts holds the needed bits.

Contributor Author

Thanks for this. Am I going in right direction with: bae314f

Member

@jerch jerch Jan 29, 2026

Following this link https://sw.kovidgoyal.net/kitty/graphics-protocol/#the-graphics-escape-code the base form of all kitty sequences is

<ESC>_G<control data>;<payload><ESC>\

so the delimiter between control data and payload is a semicolon. Next you should clarify whether the semicolon is an allowed char in the control data itself. If not, then the separation of control data vs. payload is unique and you can just scan for the first semicolon in the byte stream. If a semicolon is allowed in control data, then the separation gets more ugly - prolly the = can be used as sentinel, as it will not occur in base64 except at the very end as padding.

About the put method - it gets called for every single chunk of sequence payload (here <control data>;<payload>). It is important to know that the data can be split arbitrarily (worst case: one byte per put invocation) - so you have to collect the data.

Now to solve the separation of control data vs. base64 payload - I would assign a static Uint32Array to copy chunks over until the semicolon was found, schematically:

private _inControlCode = true; // must be reset in start()
private _controlData = new Uint32Array(4096);  // or another size that exceeds expectations
private _controlLength = 0; // must be reset in start()

public put(data: Uint32Array, start: number, end: number): void {
  if (this._aborted) return;

  // perf note: in C/C++ the following then-else branches should be flipped
  // as the payload branch is more likely (no clue if it makes a difference in JS)
  if (this._inControlCode) {
    let controlEnd = end;
    // scan for payload delimiter
    for (let i = start; i < end; ++i) {
      if (data[i] === 0x3b) {
        // ; was found, so here we have all control code data
        this._inControlCode = false;
        controlEnd = i;
        break;
      }
    }
    // copy control data over
    this._controlData.set(data.subarray(start, controlEnd), this._controlLength);
    this._controlLength += controlEnd - start;
    if (!this._inControlCode) {
      // this._controlData now contains all control data, as we finished it above
      this._parseControlData();
      // init the base64 decoder
      this._b64Decoder.init(786432);
      // we still may have base64 payload in the chunk
      // +1 is needed to skip the ; itself
      if (controlEnd+1 < end) {
        if (this._b64Decoder.put(data, controlEnd+1, end)) {
          this._aborted = true;
        }
      }
    }
  } else {
    // base64 payload
    if (this._b64Decoder.put(data, start, end)) {
      this._aborted = true;
    }
  }
}

Now this 1MB sharding makes the usage of the base64 decoder a bit more cumbersome. I suggest to wrap the b64Decoder.put calls to do the tracking transparently, schematically:

private _shardSize = 0;
private _b64put(data: Uint32Array, start: number, end: number): boolean {
  // is there still room in the base64 shard?
  const free = BASE64_SHARD_SIZE - this._shardSize;
  if (free >= end - start) {
    this._shardSize += end - start;
    return this._b64Decoder.put(data, start, end);
  }
  // fill the shard
  if (this._b64Decoder.put(data, start, start + free)) {
    return true;
  }
  // grab the decoded data
  this._b64Decoder.end();
  const chunk = this._b64Decoder.data8;  // borrowed!
  // --> store the chunk somehow...

  // handle leftover data:
  // - restart decoder for next shard decoding
  // - feed data to decoder
  this._b64Decoder.init(786432);
  this._shardSize = 0;
  return this._b64Decoder.put(data, start + free, end);
}

This last snippet also shows what to do in the end method:

public end(success: boolean): boolean | Promise<boolean> {
  if (this._aborted) return true;

  // finalize last base64 shard
  this._b64Decoder.end();
  const chunk = this._b64Decoder.data8;  // borrowed!
  // --> store the chunk somehow...

  if (success) {
    // process image data chunks further here...
  }
  // cleanup chunks here...
  return true;
}

Member

Just saw that the IIP handler is still on an old version of the base64 decoder. It refers to the very first commit in this repo, 9a0692b. Please don't change this yet, as there were API changes. Seems I need to clean this up...

Member

Just saw that there are also sequences where the payload is omitted or not base64-encoded. For non-base64 payload you basically need to evaluate the control data before spinning up the base64 machinery.

For sequences without any payload the semicolon as delimiter is also absent. To properly grab those sequences you can do it in the end method, schematically:

public end(success: boolean): boolean | Promise<boolean> {
  if (this._aborted) return true;

  // sequence without any payload
  if (this._inControlCode) {
    if (!success) return true;
    this._parseControlData();
    // exec the control commands if legit
    ...
    return true;
  }

  // payload handling goes here
  ...
}
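The _parseControlData step referenced in these snippets could start out as a plain key=value splitter - a sketch (hypothetical helper; per the kitty spec, control data is a comma-separated list of single-letter keys with integer or single-character values):

```typescript
// Sketch: split kitty control data like 'a=T,f=100,i=31' into a key/value map.
// Validation of keys and value ranges would follow in a second pass.
function parseKittyControlData(s: string): Map<string, string> {
  const keys = new Map<string, string>();
  for (const part of s.split(',')) {
    const eq = part.indexOf('=');
    if (eq > 0) keys.set(part.slice(0, eq), part.slice(eq + 1));
  }
  return keys;
}
```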

Contributor Author

@anthonykim1 anthonykim1 Feb 1, 2026

Thank you! Got lots of inspiration from this for e4710d3 and a7c4a16

I'm kind of hesitant to get rid of the full shard -> decode approach now, since tracking the bytes separately seems intimidating tho 🥲

Member

@jerch jerch Feb 2, 2026

Sorry, overlooked this comment yesterday. The full shard decoding introduces another copy, which will cause lag spikes when the decoder gets the shard for decoding.

Would it help to introduce a property on the decoder that gives you the amount of loaded base64 bytes? Something like this:

interface Decoder {
  ...
  // max allowed base64 bytes
  maxBytes: number;
  // loaded base64 bytes
  loadedBytes: number;
  // free base64 bytes (maxBytes - loadedBytes)
  freeBytes: number;
}


jerch commented Jan 23, 2026

@anthonykim1 I start to wonder if the kitty graphics handler should be integrated into the image addon. I don't know the requirements of this protocol for the render layering, but maybe things could be reused and unified? It would make the lifecycling easier and would not introduce another image layer.

But as I said - I don't know the kitty protocol requirements, so fusing it might be a futile attempt. Thoughts?

Comment on lines 16 to 18
// Load test images as base64
const BLACK_1X1_BASE64 = readFileSync('./addons/addon-kitty-graphics/fixture/black-1x1.png').toString('base64');
const RGB_3X1_BASE64 = readFileSync('./addons/addon-kitty-graphics/fixture/rgb-3x1.png').toString('base64');
Member

You also want tests for f=24 and 32

Member

@jerch jerch left a comment

A few comments from my side...

Comment on lines 566 to 573
for (let i = 0; i < pixelCount; i++) {
  data[dstOffset    ] = bytes[srcOffset    ]; // R
  data[dstOffset + 1] = bytes[srcOffset + 1]; // G
  data[dstOffset + 2] = bytes[srcOffset + 2]; // B
  data[dstOffset + 3] = isRgba ? bytes[srcOffset + 3] : ALPHA_OPAQUE;
  srcOffset += bytesPerPixel;
  dstOffset += BYTES_PER_PIXEL_RGBA;
}
Member

@jerch jerch Jan 28, 2026

This is still subpar - for RGBA data you can simply use the bytes directly; for RGB, interleaving with the 4th byte in uint32 blocks might be faster, schematically:

if (format === KittyFormat.RGBA) {
  imgData = new ImageData(bytes, width, height);
} else {
  const bytes32 = new Uint32Array(BYTES.buffer);
  const data32 = new Uint32Array(DATA.buffer);
  let dstOffset = 0;
  let srcOffset = 0;
  // assuming little endian, PIXELS must be 4-aligned
  for (let i = 0; i < PIXELS; i += 4) {
    const bloc1 = bytes32[srcOffset++];
    const bloc2 = bytes32[srcOffset++];
    const bloc3 = bytes32[srcOffset++];
    data32[dstOffset++] = bloc1 | 0xFF000000;
    data32[dstOffset++] = (bloc1 >>> 24) | (bloc2 << 8) | 0xFF000000;
    data32[dstOffset++] = (bloc2 >>> 16) | (bloc3 << 16) | 0xFF000000;
    data32[dstOffset++] = (bloc3 >>> 8) | 0xFF000000; // >>> avoids sign extension
  }
  // handle leftover here...
}

This is 28 vs. 7 reads/writes per 4 pixels (memory access is a lot heavier than the bit manipulations).

On a sidenote - this byte interleaving is a typical task where wasm-simd would outperform by far (can do it with 3 instructions - load, swizzle, store).


Edit: Fixing the bit shifts. Timing is 60 ms vs. 10 ms for 4M pixels (speedup 5-6x, or 0.2 GB/s vs. 1.3 GB/s, wasm-simd would reach here >4GB/s).
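A self-contained sketch of the 4-byte-block variant with unsigned shifts and a scalar tail for non-4-aligned pixel counts (assumes little endian and a 4-byte-aligned source buffer; names are illustrative, not the addon's):

```typescript
// Sketch: expand tightly packed RGB bytes to RGBA with opaque alpha, reading
// 3 uint32 blocks per 4 pixels. Unsigned shifts (>>>) prevent sign extension
// from leaking into neighbor bytes when the top bit of a block is set.
function rgbToRgba(src: Uint8Array, pixels: number): Uint8Array {
  const dst = new Uint8Array(pixels * 4);
  const blockPixels = pixels & ~3; // 4-aligned part
  const src32 = new Uint32Array(src.buffer, src.byteOffset, (blockPixels >> 2) * 3);
  const dst32 = new Uint32Array(dst.buffer);
  let s = 0, d = 0;
  for (let i = 0; i < blockPixels; i += 4) {
    const b1 = src32[s++]; // bytes R0 G0 B0 R1
    const b2 = src32[s++]; // bytes G1 B1 R2 G2
    const b3 = src32[s++]; // bytes B2 R3 G3 B3
    dst32[d++] = b1 | 0xFF000000;
    dst32[d++] = (b1 >>> 24) | (b2 << 8) | 0xFF000000;
    dst32[d++] = (b2 >>> 16) | (b3 << 16) | 0xFF000000;
    dst32[d++] = (b3 >>> 8) | 0xFF000000;
  }
  for (let i = blockPixels; i < pixels; i++) { // leftover 0-3 pixels
    dst[i * 4] = src[i * 3];
    dst[i * 4 + 1] = src[i * 3 + 1];
    dst[i * 4 + 2] = src[i * 3 + 2];
    dst[i * 4 + 3] = 0xFF;
  }
  return dst;
}
```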

Member

Here is a gist of different versions: https://gist.github.com/jerch/e2f7695b887228d9e703c53af23c0737

Results:

$> node lib/test.wasm.js
original 50.4 ms, 250 MB/s
slightly optimized 23.9 ms, 526 MB/s
4 byte blocks 8.1 ms, 1553 MB/s
wasm simd 2.65 ms, 4748 MB/s

Contributor Author

This is crazy.. 50 -> 2.6 🤯🤩🔥


anthonykim1 commented Jan 29, 2026

I start to wonder if the kitty graphics handler should be integrated into the image addon. I don't know the requirements of this protocol for the render layering, but maybe things could be reused and unified? It would make the lifecycling easier and would not introduce another image layer.
But as I said - I don't know the kitty protocol requirements, so fusing it might be a futile attempt. Thoughts?

@jerch Thank you for the feedback :)
Will give this a shot! I started moving things around, but not done yet 🥸


jerch commented Feb 1, 2026

Sidenote on base64 decoding - ES2026 brings Uint8Array.fromBase64 & Uint8Array.setFromBase64 (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array/fromBase64)
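Quick usage sketch, feature-detected since fromBase64 has only landed in recent engines (Buffer is the Node fallback):

```typescript
// Sketch: decode base64 via the ES2026 proposal API if available,
// falling back to Node's Buffer otherwise.
const b64 = Buffer.from('kitty').toString('base64'); // 'a2l0dHk='
const U8: any = Uint8Array;
const bytes: Uint8Array = typeof U8.fromBase64 === 'function'
  ? U8.fromBase64(b64)
  : new Uint8Array(Buffer.from(b64, 'base64'));
// bytes round-trips back to 'kitty'
```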

Need to re-eval things for the base64 decoder + encoder; at first glance they look reasonably fast (tested with node 25):

   Context "lib/base64/base64.benchmark.js"
      Context "Base64 - decode"
         Context "Node - Buffer"
            Case "256" : 100 runs - average throughput: 56.09 MB/s
            Case "4096" : 100 runs - average throughput: 465.91 MB/s
            Case "65536" : 100 runs - average throughput: 1215.26 MB/s
            Case "1048576" : 100 runs - average throughput: 1663.89 MB/s
         Context "Node - Uint8Array.fromBase64"
            Case "256" : 100 runs - average throughput: 47.96 MB/s
            Case "4096" : 100 runs - average throughput: 613.69 MB/s
            Case "65536" : 100 runs - average throughput: 1398.99 MB/s
            Case "1048576" : 100 runs - average throughput: 1373.99 MB/s
         Context "Base64Decoder"
            Case "256" : 100 runs - average throughput: 72.56 MB/s
            Case "4096" : 100 runs - average throughput: 586.40 MB/s
            Case "65536" : 100 runs - average throughput: 1477.00 MB/s
            Case "1048576" : 100 runs - average throughput: 1515.69 MB/s
      Context "Base64 - encode"
         Context "Node - Buffer"
            Case "256" : 100 runs - average throughput: 30.34 MB/s
            Case "4096" : 100 runs - average throughput: 357.81 MB/s
            Case "65536" : 100 runs - average throughput: 1451.27 MB/s
            Case "1048576" : 100 runs - average throughput: 1992.45 MB/s
         Context "Node - Uint8Array.toBase64"
            Case "256" : 100 runs - average throughput: 66.50 MB/s
            Case "4096" : 100 runs - average throughput: 865.88 MB/s
            Case "65536" : 100 runs - average throughput: 2996.19 MB/s  <-- WTH?
            Case "1048576" : 100 runs - average throughput: 1086.55 MB/s
         Context "Base64Encoder"
            Case "256" : 100 runs - average throughput: 60.01 MB/s
            Case "4096" : 100 runs - average throughput: 521.15 MB/s
            Case "65536" : 100 runs - average throughput: 1415.68 MB/s
            Case "1048576" : 100 runs - average throughput: 1564.53 MB/s

There is an issue with the wasm impls in node 25, node 20 is much faster:

   Context "lib/base64/base64.benchmark.js"
      Context "Base64 - decode"
         Context "Node - Buffer"
            Case "256" : 100 runs - average throughput: 39.19 MB/s
            Case "4096" : 100 runs - average throughput: 313.93 MB/s
            Case "65536" : 100 runs - average throughput: 504.64 MB/s
            Case "1048576" : 100 runs - average throughput: 570.46 MB/s
         Context "Base64Decoder"
            Case "256" : 100 runs - average throughput: 46.53 MB/s
            Case "4096" : 100 runs - average throughput: 654.45 MB/s
            Case "65536" : 100 runs - average throughput: 2100.95 MB/s
            Case "1048576" : 100 runs - average throughput: 2213.01 MB/s
      Context "Base64 - encode"
         Context "Node - Buffer"
            Case "256" : 100 runs - average throughput: 24.87 MB/s
            Case "4096" : 100 runs - average throughput: 331.30 MB/s
            Case "65536" : 100 runs - average throughput: 1211.42 MB/s
            Case "1048576" : 100 runs - average throughput: 1684.94 MB/s
         Context "Base64Encoder"
            Case "256" : 100 runs - average throughput: 69.87 MB/s
            Case "4096" : 100 runs - average throughput: 587.26 MB/s
            Case "65536" : 100 runs - average throughput: 2078.57 MB/s
            Case "1048576" : 100 runs - average throughput: 2259.20 MB/s

Edit: There is indeed a serious performance issue in node 24. All wasm execution suffers from this, but with SIMD it gets really bad:

jerch@linpad:~/xterm-wasm-parts$ nvm use 24
Now using node v24.13.0 (npm v11.6.2)
jerch@linpad:~/xterm-wasm-parts$ node lib/base64/Base64Decoder.wasm.js
duration: 508 ms, rate: 2064 MB/s
duration: 493 ms, rate: 2126 MB/s
duration: 522 ms, rate: 2008 MB/s
jerch@linpad:~/xterm-wasm-parts$ node lib/base64/Base64Decoder.wasm.js
duration: 517 ms, rate: 2027 MB/s
duration: 482 ms, rate: 2177 MB/s
duration: 665 ms, rate: 1577 MB/s
jerch@linpad:~/xterm-wasm-parts$ nvm use 22
Now using node v22.22.0 (npm v10.9.4)
jerch@linpad:~/xterm-wasm-parts$ node lib/base64/Base64Decoder.wasm.js
duration: 254 ms, rate: 4124 MB/s
duration: 213 ms, rate: 4928 MB/s
duration: 223 ms, rate: 4700 MB/s
jerch@linpad:~/xterm-wasm-parts$ node lib/base64/Base64Decoder.wasm.js
duration: 220 ms, rate: 4767 MB/s
duration: 199 ms, rate: 5265 MB/s
duration: 234 ms, rate: 4472 MB/s

Member

@jerch jerch left a comment

Nice, have some comments left below.

this._controlLength += copyLength;

if (!this._inControlData) {
// Found semicolon - stream remaining as payload
Member

I suggest parsing the control data here already; this way you can cancel invalid sequences early, e.g. a sequence that does not expect payload can be finished/aborted here already (depends on what the spec says about malformed sequences).

Contributor Author

Good point! Added d44900f. The spec only seems to call out i and I explicitly, so currently all other invalid values are ignored or fall back to the base case.

Maybe we could be more opinionated and handle them explicitly, like i and I?

Member

@jerch jerch Feb 3, 2026

Hmm I hate sloppy specs 👿

I basically had the same issue with IIP: in an older version the byteSize was mandatory and excess data would invalidate the sequence (this was a good and concise spec), but it got watered down to optional in a newer version of iTerm2. Currently IIP is still on that older, stricter spec, but we might need to change that too, as people use the newer sequences more often now.

If you ask me - I am more on the strict side of things regarding this. I would abort on any malformed sequence (e.g. invalid keys, payload in a sequence that doesn't expect payload), as it greatly lowers the chance that malicious data enters the terminal. Ofc additional limitations to the base spec need to be clearly stated in the docs.

Comment on lines 195 to 196
const chunk = new Uint8Array(this._decoder.data8);
this._decodedChunks.push(chunk);
Member

Hmm, maybe use the blob interface here? That would save precious memory in the JS context. But it has the downside of being async on retrieval (thus it only works when the final payload is used in end with a promise as return value).

Comment on lines 309 to 313
return this._handleTransmit(cmd, bytes, decodeError);
case KittyAction.TRANSMIT_DISPLAY:
return this._handleTransmitDisplay(cmd, bytes, decodeError);
case KittyAction.QUERY:
return this._handleQuery(cmd, bytes, decodeError);
Member

If any of these follow-up actions store the bytes for later usage, then you must make a copy of them. Currently the image data gets passed on borrowed from the base64 decoder in end, which is optimal for actions that finalize the image processing before end returns (things like new ImageData or createImageBitmap or new Blob already copy the data itself).

Specifically _handleTransmit looks fishy to me, as it stores the data in your this._images map. Easy fix - just replace it with:

...
return this._handleTransmit(cmd, new Uint8Array(bytes), decodeError);
...

Haven't checked the other 2 code paths _handleTransmitDisplay and _handleQuery.

Contributor Author

@anthonykim1 anthonykim1 Feb 6, 2026

Wow... Didn't even realize that the underlying memory could be reused later by the decoder for a different image. Thank you, this is very helpful!

Done here --> 596ab84

Member

@jerch jerch Feb 6, 2026

Yeah, that is the idea of keepSize - to reuse the same memory later on to lower GC pressure. But with this borrow pattern things need to be tracked carefully. It works well if the final action copies or transforms the data synchronously anyway (e.g. new ImageData(bytes)).

I see that you also did it for _handleTransmitDisplay - is that really needed? Does the kitty protocol demand that you hold on to the original bytes?


jerch commented Feb 5, 2026

@anthonykim1 Question about your image map:

private _images: Map<number, IKittyImageData> = new Map();

This seems to store kitty image data outside of the ImageStorage. We should avoid storing data twice. Furthermore we have to abide by very strict memory handling here, which ImageStorage tries to do (half of its code is about proper image eviction...).

Suggestion - check if we can reuse & extend ImageStorage for what is missing for the kitty protocol. If that is futile and you truly need a separate data storage, then we need to offload the data somehow (e.g. by using blobs). Note that we only have 2GB in JavaScript, which will be exhausted in a few seconds if someone tries to play a movie via the kitty protocol. 🎦


anthonykim1 commented Feb 6, 2026

@anthonykim1 Question about your image map:
...
This seems to store kitty image data outside of the ImageStorage. We should avoid storing data twice. Furthermore we have to abide by very strict memory handling here, which ImageStorage tries to do (half of its code is about proper image eviction...).

Suggestion - check if we can reuse & extend ImageStorage for what is missing for the kitty protocol. If that is futile and you truly need a separate data storage, then we need to offload the data somehow (e.g. by using blobs). Note that we only have 2GB in JavaScript, which will be exhausted in a few seconds if someone tries to play a movie via the kitty protocol. 🎦

@jerch Yeah, this is a challenging one. I don't think we can discard raw data all the time since we won't be able to re-display at different size, a=p (display previously transmitted), etc.

I'll play around with the blob offloading.

The flow would be:
store raw image data as a blob, then on display decode blob -> ImageBitmap -> pass to ImageStorage?
This way ImageStorage would handle eviction for rendered bitmaps, while the blob persists for re-rendering at a different size, etc.

And then the delete command a=d would clean up the blob. I should also clear on clear screen, on transmitting with the same id, etc.


jerch commented Feb 6, 2026

@jerch Yeah, this is a challenging one. I don't think we can discard raw data all the time since we won't be able to re-display at different size, a=p (display previously transmitted), etc.

This is already done by the ImageStorage - it stores the original asset along with a maybe resized one:

export interface IImageSpec {
  orig: HTMLCanvasElement | ImageBitmap | undefined;
  origCellSize: ICellSize;
  actual: HTMLCanvasElement | ImageBitmap | undefined;
  actualCellSize: ICellSize;
  marker: IMarker | undefined;
  tileCount: number;
  bufferType: 'alternate' | 'normal';
}

The resized one (the actual... entries) kicks in when you do a font resize on the terminal, i.e. when the cell size changes. Resizing always happens from the orig one, to not accumulate artifacts from previous resizes.

Flow would be like:
storing raw image data as blob and then on display, decode from blob -> imageBitmap -> pass to Imagestorage?
This way ImageStorage would handle eviction for rendered bitmaps, while blob would persist for re-rendering at different size, etc.

And then the delete command a=d would clean up blob. I should also clear on clear screen, transmitting with same id, etc

You are reinventing what is already there. ImageStorage also allows deleting an image by an id. If the protocol insists on providing ids from outside, then all you need is an outsideId --> storageId map, delegating the delete to ImageStorage. Explicit deletion is currently not exposed by ImageStorage, as the other protocols have no means to do that explicitly.
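The id delegation could be as small as this sketch (names hypothetical; IStorageLike.delete stands in for whatever deletion hook ImageStorage would expose):

```typescript
// Hypothetical: map kitty-protocol image ids to internal ImageStorage ids
// and delegate deletion, instead of keeping a second image store.
interface IStorageLike { delete(internalId: number): void; }

class KittyIdMap {
  private _ids = new Map<number, number>();
  constructor(private _storage: IStorageLike) {}
  public register(kittyId: number, storageId: number): void {
    this._ids.set(kittyId, storageId);
  }
  public delete(kittyId: number): boolean { // a=d with i=<id>
    const storageId = this._ids.get(kittyId);
    if (storageId === undefined) return false;
    this._storage.delete(storageId);
    return this._ids.delete(kittyId);
  }
}
```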

Explicit later resizing by the protocol introduces several issues, esp. around cell alignment and how to evict the previous tiles rendered. This prolly needs a clever separate method on ImageStorage similar to addImage.

Edit:
I just looked up the kitty protocol - hmm, those placements from a single source need some clarification. They introduce an indirection between loading image data and actually displaying it. The image storage cannot do that at the moment. Also its eviction rules are solely based on whether image tiles are actually visible (invisible ones deleted first, then older ones next in ascending id progression). So we'd have to change that, hmm.

Finally, you can specify the image z-index, i.e. the vertical stacking order. Images placed in the same location with different z-index values will be blended if they are semi-transparent. You can specify z-index values using the z key. Negative z-index values mean that the images will be drawn under the text.

Eww, that is not possible at all currently, as the ImageRenderer always renders on top. The negative z-index would prolly be possible with the webgl renderer; not sure if we ever get this working with the DOM renderer. The additional layering on top needs render composition, which is a TODO for the ImageRenderer.

So yeah after you sorted out the storage needs, the render side of things will get even more funny due to technical limitations.


jerch commented Feb 6, 2026

@anthonykim1 I think we should create a discussion about the kitty protocol listing all its features, and discuss first what can be implemented and how, and what not. It is so loaded with features that working in a certain direction might negate other aspects if not considered from the beginning.


anthonykim1 commented Feb 7, 2026

@jerch Good idea! --> #5683

I'll try to list out the components as sub-items/lists in the discussion, their status, etc. tomorrow. I also think it's easier for us to talk if I can number the chart/list and reference items individually (this PR is getting big and there's a lot to KGP!)

I'd hate to miss out on any valuable feedback!
