Crawl4AI Cache System and Migration Guide
Overview
Starting with version 0.5.0, Crawl4AI introduces a new caching system that replaces the old boolean flags with a single, more intuitive CacheMode enum. This change simplifies cache control and makes caching behavior more predictable.
Old vs New Approach
Old Way (Deprecated)
The old system used multiple boolean flags:
- bypass_cache: Skip the cache entirely
- disable_cache: Disable all caching
- no_cache_read: Don't read from the cache
- no_cache_write: Don't write to the cache
New Way (Recommended)
The new system uses a single CacheMode enum:
- CacheMode.ENABLED: Normal caching (read and write)
- CacheMode.DISABLED: No caching at all
- CacheMode.READ_ONLY: Only read from the cache
- CacheMode.WRITE_ONLY: Only write to the cache
- CacheMode.BYPASS: Skip the cache for this operation
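The mode descriptions above can be read as a pair of permissions: whether a crawl may read from the cache and whether it may write to it. The sketch below illustrates that reading with a local stand-in enum, so it runs without Crawl4AI installed. The stand-in's string values and the `cache_permissions` helper are assumptions for illustration, not the library's internals.

```python
from enum import Enum
from typing import Tuple


class CacheMode(Enum):
    """Local stand-in mirroring the mode names listed in this guide."""
    ENABLED = "enabled"
    DISABLED = "disabled"
    READ_ONLY = "read_only"
    WRITE_ONLY = "write_only"
    BYPASS = "bypass"


def cache_permissions(mode: CacheMode) -> Tuple[bool, bool]:
    """Hypothetical helper: return (may_read, may_write) for a mode."""
    # Reading is allowed only by the modes described as reading.
    may_read = mode in (CacheMode.ENABLED, CacheMode.READ_ONLY)
    # Writing is allowed only by the modes described as writing.
    may_write = mode in (CacheMode.ENABLED, CacheMode.WRITE_ONLY)
    return may_read, may_write


if __name__ == "__main__":
    for mode in CacheMode:
        may_read, may_write = cache_permissions(mode)
        print(f"{mode.name:<10} read={may_read!s:<5} write={may_write}")
```

Under this reading, DISABLED and BYPASS both come out as neither reading nor writing. The guide distinguishes them only by intent: DISABLED turns caching off, while BYPASS skips it for a single operation.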
Migration Example
Old Code (Deprecated)
import asyncio
from crawl4ai import AsyncWebCrawler

async def use_proxy():
    async with AsyncWebCrawler(verbose=True) as crawler:
        result = await crawler.arun(
            url="https://www.nbcnews.com/business",
            bypass_cache=True  # Old way: deprecated boolean flag
        )
        print(len(result.markdown))

async def main():
    await use_proxy()

if __name__ == "__main__":
    asyncio.run(main())
New Code (Recommended)
import asyncio
from crawl4ai import AsyncWebCrawler, CacheMode
from crawl4ai.async_configs import CrawlerRunConfig

async def use_proxy():
    # Use CacheMode in CrawlerRunConfig
    config = CrawlerRunConfig(cache_mode=CacheMode.BYPASS)
    async with AsyncWebCrawler(verbose=True) as crawler:
        result = await crawler.arun(
            url="https://www.nbcnews.com/business",
            config=config  # Pass the configuration object
        )
        print(len(result.markdown))

async def main():
    await use_proxy()

if __name__ == "__main__":
    asyncio.run(main())
Common Migration Patterns
| Old Flag | New Mode |
|---|---|
| bypass_cache=True | cache_mode=CacheMode.BYPASS |
| disable_cache=True | cache_mode=CacheMode.DISABLED |
| no_cache_read=True | cache_mode=CacheMode.WRITE_ONLY |
| no_cache_write=True | cache_mode=CacheMode.READ_ONLY |
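When many call sites still pass the old flags, the table above can be encoded as a small lookup. The following is a minimal sketch, not part of Crawl4AI: the `legacy_flags_to_mode_name` helper and its precedence rules for combined flags are hypothetical. It returns the CacheMode member name as a string, so it runs without the library installed.

```python
from typing import Any, Dict

# Legacy flag -> CacheMode member name, copied from the migration table.
# The order only matters when several flags are set at once; that
# precedence is an assumption, not documented behavior.
LEGACY_FLAG_TO_MODE = {
    "disable_cache": "DISABLED",
    "bypass_cache": "BYPASS",
    "no_cache_read": "WRITE_ONLY",
    "no_cache_write": "READ_ONLY",
}


def legacy_flags_to_mode_name(kwargs: Dict[str, Any]) -> str:
    """Hypothetical helper: pick a CacheMode member name for old-style kwargs."""
    # Skipping both reads and writes leaves nothing for the cache to do.
    if kwargs.get("no_cache_read") and kwargs.get("no_cache_write"):
        return "DISABLED"
    for flag, mode_name in LEGACY_FLAG_TO_MODE.items():
        if kwargs.get(flag):
            return mode_name
    # No legacy flag set: fall back to normal read/write caching.
    return "ENABLED"


if __name__ == "__main__":
    print(legacy_flags_to_mode_name({"bypass_cache": True}))
    print(legacy_flags_to_mode_name({"no_cache_read": True}))
    print(legacy_flags_to_mode_name({}))
```

Assuming the enum's member names match those listed in this guide, the returned string can be turned into a real mode with standard Enum name lookup, for example `CacheMode[legacy_flags_to_mode_name(old_kwargs)]`, and then passed as `cache_mode` to CrawlerRunConfig.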