Commit 8b603c8

Update version to 0.0.5, add URL decoding functionality for Google News articles, and enhance batch search capabilities with progress tracking.
1 parent 57ed2ed commit 8b603c8

6 files changed: +598 -45 lines changed


README.md

Lines changed: 113 additions & 23 deletions
@@ -1,17 +1,38 @@
 # Google News API Client

+[![PyPI Downloads](https://static.pepy.tech/badge/google-news-api)](https://pepy.tech/projects/google-news-api)
+[![Python Version](https://img.shields.io/badge/python-3.9%2B-blue)](https://www.python.org/downloads/)
+[![PyPI Version](https://img.shields.io/pypi/v/google-news-api)](https://pypi.org/project/google-news-api/)
+
 A robust Python client library for the Google News RSS feed API that provides both synchronous and asynchronous implementations with built-in rate limiting, caching, and error handling.

 ## Features

 - ✨ Comprehensive news search and retrieval functionality
+  - Search by keywords with advanced filtering
+  - Get top news by topic (WORLD, NATION, BUSINESS, TECHNOLOGY, etc.)
+  - Batch search support for multiple queries
+  - URL decoding for original article sources
 - 🔄 Both synchronous and asynchronous APIs
-- 🕒 Advanced time-based search capabilities (date ranges and relative time)
-- 🚀 High performance with in-memory caching (TTL-based)
-- 🛡️ Built-in rate limiting with token bucket algorithm
-- 🔁 Automatic retries with exponential backoff
+  - `GoogleNewsClient` for synchronous operations
+  - `AsyncGoogleNewsClient` for async/await support
+- 🕒 Advanced time-based search capabilities
+  - Date range filtering (after/before)
+  - Relative time filtering (e.g., "1h", "24h", "7d")
+  - Maximum 100 results for date-based searches
+- 🚀 High performance features
+  - In-memory caching with configurable TTL
+  - Built-in rate limiting with token bucket algorithm
+  - Automatic retries with exponential backoff
+  - Concurrent batch searches in async mode
 - 🌍 Multi-language and country support
-- 🛠️ Robust error handling and validation
+  - ISO 639-1 language codes (e.g., "en", "fr", "de")
+  - ISO 3166-1 country codes (e.g., "US", "GB", "DE")
+  - Language-country combinations (e.g., "en-US", "fr-FR")
+- 🛡️ Robust error handling
+  - Specific exceptions for different error scenarios
+  - Detailed error messages with context
+  - Graceful fallbacks and retries
 - 📦 Modern Python packaging with Poetry

 ## Requirements
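
Review note on the hunk above: the feature list names a token bucket for rate limiting, but the implementation is not part of this diff. The sketch below shows the general technique only. It is not the library's `RateLimiter`; the class name `TokenBucket`, the `try_acquire` method, and the injectable `clock` argument are all hypothetical and chosen for illustration. It assumes the bucket holds at most `requests_per_minute` tokens and refills at that rate.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket (hypothetical; not the library's RateLimiter)."""

    def __init__(
        self,
        requests_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        self.capacity = float(requests_per_minute)      # maximum burst size
        self.refill_rate = requests_per_minute / 60.0   # tokens added per second
        self.tokens = self.capacity                     # start with a full bucket
        self._clock = clock
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens in proportion to elapsed time, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)

    def try_acquire(self) -> bool:
        """Take one token if available; return False instead of blocking."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A caller would check `bucket.try_acquire()` before each request and sleep or queue when it returns `False`. Passing a fake `clock` makes the refill behaviour easy to test without real waiting.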
@@ -55,29 +76,39 @@ client = GoogleNewsClient(
 )

 try:
-    # Get top news
-    top_articles = client.top_news(max_results=3)
-    for article in top_articles:
-        print(f"Top News: {article['title']} - {article['source']}")
-
+    # Get top news by topic
+    world_news = client.top_news(topic="WORLD", max_results=5)
+    tech_news = client.top_news(topic="TECHNOLOGY", max_results=3)
+
     # Search with date range
     date_articles = client.search(
         "Ukraine war",
         after="2024-01-01",
         before="2024-03-01",
         max_results=5
     )
-    for article in date_articles:
-        print(f"Recent News: {article['title']} - {article['published']}")
-
+
     # Search with relative time
     recent_articles = client.search(
         "climate change",
         when="24h", # Last 24 hours
         max_results=5
     )
-    for article in recent_articles:
-        print(f"Latest News: {article['title']} - {article['published']}")
+
+    # Batch search multiple queries
+    batch_results = client.batch_search(
+        queries=["AI", "machine learning", "deep learning"],
+        when="7d", # Last 7 days
+        max_results=3
+    )
+
+    # Process results
+    for topic, articles in batch_results.items():
+        print(f"\nTop {topic} news:")
+        for article in articles:
+            print(f"- {article['title']} ({article['source']})")
+            print(f" Published: {article['published']}")
+            print(f" Summary: {article['summary'][:100]}...")

 except Exception as e:
     print(f"An error occurred: {e}")
@@ -99,24 +130,76 @@ async def main():
         requests_per_minute=60
     ) as client:
         # Fetch multiple news categories concurrently
-        tech_news = await client.search("technology", max_results=3)
-        science_news = await client.search("science", max_results=3)
+        world_news = await client.top_news(topic="WORLD", max_results=3)
+        tech_news = await client.top_news(topic="TECHNOLOGY", max_results=3)
+
+        # Batch search with concurrent execution
+        batch_results = await client.batch_search(
+            queries=["AI", "machine learning", "deep learning"],
+            when="7d",
+            max_results=3
+        )

-        print(f"Found {len(tech_news)} technology articles")
-        print(f"Found {len(science_news)} science articles")
+        # Decode Google News URLs to original sources
+        for topic, articles in batch_results.items():
+            print(f"\nTop {topic} news:")
+            for article in articles:
+                original_url = await client.decode_url(article['link'])
+                print(f"- {article['title']} ({article['source']})")
+                print(f" Original URL: {original_url}")

 if __name__ == "__main__":
     asyncio.run(main())
 ```

 ## Configuration

+The library provides extensive configuration options through the client initialization:
+
 | Parameter | Description | Default | Example Values |
 |-----------|-------------|---------|----------------|
-| `language` | Two-letter language code (ISO 639-1) | `"en"` | `"es"`, `"fr"`, `"de"` |
-| `country` | Two-letter country code (ISO 3166-1) | `"US"` | `"GB"`, `"DE"`, `"JP"` |
-| `requests_per_minute` | Rate limit threshold | `60` | `30`, `100`, `120` |
-| `cache_ttl` | Cache duration in seconds | `300` | `600`, `1800`, `3600` |
+| `language` | Two-letter language code (ISO 639-1) or language-country format | `"en"` | `"en"`, `"fr"`, `"de"`, `"en-US"`, `"fr-FR"` |
+| `country` | Two-letter country code (ISO 3166-1 alpha-2) | `"US"` | `"US"`, `"GB"`, `"DE"`, `"JP"` |
+| `requests_per_minute` | Rate limit threshold for API requests | `60` | `30`, `100`, `120` |
+| `cache_ttl` | Cache duration in seconds for responses | `300` | `600`, `1800`, `3600` |
+
+### Available Topics
+
+The `top_news()` method supports the following topics:
+- `"WORLD"` - World news
+- `"NATION"` - National news
+- `"BUSINESS"` - Business news
+- `"TECHNOLOGY"` - Technology news
+- `"ENTERTAINMENT"` - Entertainment news
+- `"SPORTS"` - Sports news
+- `"SCIENCE"` - Science news
+- `"HEALTH"` - Health news
+
+### Time-Based Search
+
+The library supports two types of time-based search:
+
+1. **Date Range Search**
+   - Use `after` and `before` parameters
+   - Format: `YYYY-MM-DD`
+   - Maximum 100 results
+   - Example: `after="2024-01-01", before="2024-03-01"`
+
+2. **Relative Time Search**
+   - Use the `when` parameter
+   - Hours: `"1h"` to `"101h"`
+   - Days: Any number of days (e.g., `"7d"`, `"30d"`)
+   - Cannot be used with `after`/`before`
+   - Example: `when="24h"` for last 24 hours
+
+### Article Structure
+
+Each article in the results contains the following fields:
+- `title`: Article title
+- `link`: Google News article URL
+- `published`: Publication date and time
+- `summary`: Article summary/description
+- `source`: News source name

 ## Error Handling
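
Review note on the Time-Based Search section added above: the documented rules (`YYYY-MM-DD` dates, `"1h"` to `"101h"`, any number of days, and `when` being mutually exclusive with `after`/`before`) can be expressed as a small validator. This is a sketch of those rules as written in the README, not the library's own validation code. The function name `validate_time_args` and the use of `ValueError` are hypothetical, and treating "any number of days" as "at least 1" is an assumption.

```python
import re
from datetime import date
from typing import Optional

# Patterns for the documented formats: "<number>h" / "<number>d" and YYYY-MM-DD.
_WHEN_RE = re.compile(r"([0-9]+)([hd])")
_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def _parse_date(value: str, name: str) -> date:
    """Check the YYYY-MM-DD shape, then check it is a real calendar date."""
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"{name} must use YYYY-MM-DD, got {value!r}")
    try:
        return date.fromisoformat(value)
    except ValueError as exc:
        raise ValueError(f"{name} is not a valid calendar date: {value!r}") from exc


def validate_time_args(
    when: Optional[str] = None,
    after: Optional[str] = None,
    before: Optional[str] = None,
) -> None:
    """Raise ValueError if the arguments break the rules listed in the README."""
    if when is not None and (after is not None or before is not None):
        raise ValueError("when cannot be combined with after/before")

    if when is not None:
        match = _WHEN_RE.fullmatch(when)
        if match is None:
            raise ValueError(f"when must look like '24h' or '7d', got {when!r}")
        amount, unit = int(match.group(1)), match.group(2)
        if unit == "h" and not 1 <= amount <= 101:
            raise ValueError("hours must be between 1h and 101h")
        if unit == "d" and amount < 1:  # assumption: zero days is rejected
            raise ValueError("days must be at least 1")

    if after is not None:
        _parse_date(after, "after")
    if before is not None:
        _parse_date(before, "before")
```

For example, `validate_time_args(when="24h")` passes silently, while `validate_time_args(when="24h", after="2024-01-01")` raises `ValueError` because the two styles are mixed.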

@@ -185,6 +268,9 @@ poetry run pytest

 # Run tests with coverage
 poetry run pytest --cov=google_news_api
+
+# Run pre-commit on all files
+pre-commit run --all-files
 ```

 ## Contributing
@@ -205,6 +291,10 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file

 Paolo Mazza (mazzapaolo2019@gmail.com)

+## Acknowledgments
+
+- The URL decoding functionality is based on the work of [SSujitX/google-news-url-decoder](https://github.com/SSujitX/google-news-url-decoder)
+
 ## Support

 For issues, feature requests, or questions:
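
Review note on the async example in this README diff: the comment says "Fetch multiple news categories concurrently", but the two `top_news` calls are awaited one after the other, so the second only starts once the first has finished. One common way to overlap them is `asyncio.gather`. The sketch below uses a stand-in coroutine, `fake_top_news`, which is hypothetical and simply sleeps instead of calling the network. With the real client, one would presumably pass `client.top_news(topic=..., max_results=...)` calls to `gather` instead. How the library's rate limiter interacts with that is not shown in this diff.

```python
import asyncio
from typing import Dict, List


async def fake_top_news(topic: str, max_results: int = 3) -> List[Dict[str, str]]:
    """Hypothetical stand-in for `client.top_news`; sleeps instead of doing I/O."""
    await asyncio.sleep(0.05)
    return [
        {"title": f"{topic} story {i}", "source": "example"}
        for i in range(max_results)
    ]


async def fetch_topics(topics: List[str]) -> Dict[str, List[Dict[str, str]]]:
    # gather() schedules every coroutine at once and returns the results
    # in the same order as the inputs, so zip() pairs them back up safely.
    results = await asyncio.gather(*(fake_top_news(topic) for topic in topics))
    return dict(zip(topics, results))
```

Calling `asyncio.run(fetch_topics(["WORLD", "TECHNOLOGY"]))` returns a dict keyed by topic, with the simulated waits overlapping rather than adding up.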

google_news_api/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@
 from .logging import setup_logging
 from .utils import AsyncCache, AsyncRateLimiter, Cache, RateLimiter

-__version__ = "0.0.4"
+__version__ = "0.0.5"
 __author__ = "Paolo Mazza"
 __email__ = "mazzapaolo2019@gmail.com"
