)
```

## Streaming API

In addition to standard HTTP queries, `hugr-client` supports asynchronous streaming of data via WebSocket. This allows you to receive large datasets in batches or row-by-row, without waiting for the entire result to be loaded into memory.

### Quick Start

```python
import asyncio
from hugr.stream import connect_stream

async def main():
    client = connect_stream("http://localhost:15001/ipc")

    # HTTP query for total count
    result = client.query("query { devices_aggregation { _rows_count } }")
    print("Total devices:", result.record()['_rows_count'])

    # Stream data in batches (Arrow RecordBatch)
    async with await client.stream(
        """
        query {
            devices {
                id
                name
                geom
            }
        }
        """
    ) as stream:
        async for batch in stream.chunks():
            df = batch.to_pandas()
            print("Batch:", len(df), "rows")

    # Stream data row by row
    async with await client.stream(
        "query { devices { id name status } }"
    ) as stream:
        async for row in stream.rows():
            print(row)

asyncio.run(main())
```

### Main Features

- **connect_stream**: create a streaming client (WebSocket).
- **client.stream(query, variables=None)**: asynchronously get a stream of Arrow RecordBatch for a GraphQL query.
- **stream.chunks()**: async generator for batches (RecordBatch).
- **stream.rows()**: async generator for rows (dict).
- **stream.to_pandas()**: collect all streamed data into a pandas.DataFrame.
- **stream.count()**: count the number of rows in the stream.
- **stream_data_object(data_object, fields, variables=None)**: stream a specific data object and fields.
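
Because `stream.rows()` is described as an async generator, it should compose with small helpers of your own. The sketch below is a minimal illustration and not part of `hugr-client`: `take_rows` is a hypothetical helper, and `FakeStream` is a stand-in used only so the example runs without a server. It assumes nothing about a stream beyond a `rows()` method that yields dicts.

```python
import asyncio


class FakeStream:
    """Hypothetical stand-in for a hugr stream; only rows() is modelled."""

    def __init__(self, rows):
        self._rows = list(rows)

    async def rows(self):
        for row in self._rows:
            await asyncio.sleep(0)  # yield control, as a network read would
            yield row


async def take_rows(stream, limit):
    """Collect at most `limit` rows from any object exposing an async rows()."""
    collected = []
    if limit <= 0:
        return collected
    async for row in stream.rows():
        collected.append(row)
        if len(collected) >= limit:
            break  # stop consuming; later rows are never read
    return collected


async def demo():
    stream = FakeStream({"id": i, "name": f"device-{i}"} for i in range(10))
    return await take_rows(stream, 3)


first_three = asyncio.run(demo())
print(first_three)
```

With a real connection, the same helper would presumably be called as `await take_rows(stream, 100)` inside an `async with await client.stream(...) as stream:` block. Breaking out of the loop only stops local consumption; the Query Cancellation example shows `client.cancel_current_query()` for stopping the query itself.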

### Example: Collect DataFrame via Streaming

```python
import asyncio
from hugr.stream import connect_stream

async def main():
    client = connect_stream("http://localhost:15001/ipc")
    async with await client.stream(
        "query { devices { id name geom } }"
    ) as stream:
        df = await stream.to_pandas()
        print(df.head())

asyncio.run(main())
```

### Example: Row-by-row Processing

```python
import asyncio
from hugr.stream import connect_stream

async def main():
    client = connect_stream()
    async with await client.stream(
        "query { devices { id name status } }"
    ) as stream:
        async for row in stream.rows():
            if row.get("status") == "active":
                print("Active device:", row["name"])

asyncio.run(main())
```

### Example: Query Cancellation

```python
import asyncio
from hugr.stream import connect_stream

async def main():
    client = connect_stream()
    async with await client.stream(
        "query { devices { id name } }"
    ) as stream:
        count = 0
        async for batch in stream.chunks():
            count += batch.num_rows
            if count > 1000:
                await client.cancel_current_query()
                break

asyncio.run(main())
```

### Notes

- All streaming functions are asynchronous and require `async`/`await`.
- Dependencies: `websockets`, `pyarrow`, `pandas`.
- You can use both a pure streaming client and an enhanced client with HTTP and WebSocket support.

See more in [hugr/stream.py](hugr/stream.py) and the code examples in the source files.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.