Mirror of https://github.com/jlengrand/OpenGraphKt.git (synced 2026-03-10 08:31:23 +00:00)
Feat/updates 10 25 (#42)
* Update dependency com.fleeksoft.ksoup:ksoup-network to v0.2.5 (#40)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Update whole ksoup to 0.2.5
* Update dependency gradle to v8.14.3 (#37)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Update dependency org.junit:junit-bom to v5.14.0 (#36)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Update dependency gradle to v8.14.3 (#43)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Update plugin org.jetbrains.kotlin.jvm to v2.2.20 (#35)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Update ktor from scraper
* Update settings
* Update settings
* Update gradle/actions action to v4.4.4 (#31)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Adds claude init
* Upgrades to Java 24

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Committed by: GitHub
Parent: e38c968151
Commit: 12de34aa60
CLAUDE.md (normal file, 73 lines added)
@@ -0,0 +1,73 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

OpenGraphKt is a minimalist Kotlin multiplatform library for parsing Open Graph protocol tags from HTML. It wraps Ksoup (a Kotlin port of JSoup) to extract and structure Open Graph metadata.

**Current Status**: Pre-alpha - Protocol implementation is complete for `og:` tags, but type system needs refinement.

## Project Structure

This is a multi-module Gradle project:

- `opengraphkt/` - Core library module (published to Maven Central as `fr.lengrand:opengraphkt`)
- `demo/` - Local file parsing examples
- `demo-remote/` - Remote URL parsing examples (see Main.kt for usage)
- `scrape-test/` - Testing/scraping utilities

## Common Commands

### Build and Test
```bash
./gradlew build              # Build all modules
./gradlew test               # Run all tests
./gradlew :opengraphkt:test  # Run tests for core library only
```

### Code Coverage
```bash
./gradlew koverXmlReport  # Generate XML coverage report
./gradlew koverVerify     # Verify coverage meets 70% minimum threshold
```

### Publishing
```bash
./gradlew publishToMavenLocal  # Publish to local Maven repo for testing
```

## Architecture

### Core Components

**Parser (`Parser.kt`)**: Main entry point that accepts multiple input types:
- `parse(url: URL)` - Fetches and parses remote HTML
- `parse(html: String)` - Parses raw HTML string
- `parse(file: File)` - Parses local HTML file
- `parse(document: Document)` - Parses Ksoup Document

The parser extracts `meta[property^=og:]` tags and builds structured data models.

**Data Models (`Models.kt`)**: Type-safe representations of Open Graph data:
- `Data` - Main container with `isValid()` method checking required fields (title, type, image, url)
- Base types: `Image`, `Video`, `Audio`
- Content-specific types: `Article`, `Book`, `Profile`
- Music types: `MusicSong`, `MusicAlbum`, `MusicPlaylist`, `MusicRadioStation`
- Video types: `VideoMovie`, `VideoEpisode`
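
The required-field check described for `Data.isValid()` can be sketched as follows. This is a minimal illustration, not the library's actual model: `OgData` and `OgImage` are hypothetical stand-ins, and treating blank strings as missing is an assumption.

```kotlin
// Hypothetical, simplified stand-ins for the library's `Data` and `Image` models.
data class OgImage(val url: String)

data class OgData(
    val title: String?,
    val type: String?,
    val url: String?,
    val images: List<OgImage> = emptyList(),
) {
    // Valid only when all four required Open Graph fields are present.
    // Assumption: a blank string counts as missing.
    fun isValid(): Boolean =
        !title.isNullOrBlank() &&
            !type.isNullOrBlank() &&
            !url.isNullOrBlank() &&
            images.isNotEmpty()
}

fun main() {
    val complete = OgData(
        title = "The Rock",
        type = "video.movie",
        url = "https://example.com/rock",
        images = listOf(OgImage("https://example.com/rock.jpg")),
    )
    println(complete.isValid())                             // true
    println(complete.copy(images = emptyList()).isValid())  // false
}
```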

### Key Implementation Details

**Tag Grouping**: Tags are grouped by namespace (prefix before first colon) to handle structured properties like `og:image:width`, `og:image:height` that belong to the preceding `og:image` tag.
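
One reading of "prefix before first colon" is a plain `groupBy` on the property name. The sketch below assumes that reading; `NamespacedTag` and `groupByNamespace` are hypothetical names, and the sketch does not reproduce how the library selects tags in the first place.

```kotlin
// Hypothetical representation of a <meta> element: its `property` and `content` attributes.
data class NamespacedTag(val property: String, val content: String)

// Groups tags by the text before the first colon of the property name.
// `groupBy` keeps document order inside each group, which the later
// pairing of `og:image` with `og:image:width` relies on.
fun groupByNamespace(tags: List<NamespacedTag>): Map<String, List<NamespacedTag>> =
    tags.groupBy { it.property.substringBefore(':') }

fun main() {
    val tags = listOf(
        NamespacedTag("og:title", "Example"),
        NamespacedTag("og:image", "https://example.com/a.png"),
        NamespacedTag("og:image:width", "300"),
        NamespacedTag("article:author", "Jane Doe"),
    )
    val grouped = groupByNamespace(tags)
    println(grouped.keys)  // [og, article]
}
```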

**Date Handling**: ISO 8601 datetime parsing with fallback for date-only formats (appends `T00:00:00Z`).
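
The fallback can be illustrated with a short sketch. It assumes `java.time.OffsetDateTime` on the JVM, which may not be the type the library actually uses; `parseOgDateTime` and `tryParse` are hypothetical helper names.

```kotlin
import java.time.OffsetDateTime
import java.time.format.DateTimeParseException

// Returns the parsed value, or null when the text is not an ISO 8601 offset datetime.
private fun tryParse(text: String): OffsetDateTime? =
    try {
        OffsetDateTime.parse(text)
    } catch (e: DateTimeParseException) {
        null
    }

// First try the value as a full datetime; if that fails, treat it as a
// date-only value and append midnight UTC before retrying.
// Inputs that match neither form (including datetimes without an offset) yield null.
fun parseOgDateTime(raw: String): OffsetDateTime? =
    tryParse(raw) ?: tryParse(raw + "T00:00:00Z")

fun main() {
    println(parseOgDateTime("2025-10-25T08:30:00Z")?.toLocalDate())  // 2025-10-25
    println(parseOgDateTime("2025-10-25")?.toLocalDate())            // 2025-10-25
    println(parseOgDateTime("not a date"))                           // null
}
```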

**Structured Property Association**: Images/Videos/Audio with their metadata (width, height, type, etc.) are associated by parsing sequential tags - each base tag (`og:image`) is paired with following attribute tags (`og:image:width`) until the next base tag.
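
The sequential pairing might look roughly like the sketch below. `OgTag`, `StructuredItem`, and `associate` are hypothetical names, and ignoring attribute tags that appear before any base tag is an assumption rather than documented behavior.

```kotlin
// Hypothetical tag and result types, used only for this illustration.
data class OgTag(val property: String, val content: String)
data class StructuredItem(val url: String, val attributes: Map<String, String>)

// Walks the tags in document order. A tag equal to `base` (e.g. "og:image")
// starts a new item; a tag prefixed with "$base:" is attached to the most
// recently started item.
fun associate(tags: List<OgTag>, base: String): List<StructuredItem> {
    val items = mutableListOf<Pair<String, MutableMap<String, String>>>()
    for (tag in tags) {
        when {
            tag.property == base ->
                items.add(Pair(tag.content, mutableMapOf<String, String>()))
            // Assumption: attributes seen before any base tag are dropped.
            tag.property.startsWith("$base:") && items.isNotEmpty() ->
                items.last().second[tag.property.removePrefix("$base:")] = tag.content
        }
    }
    return items.map { (url, attributes) -> StructuredItem(url, attributes) }
}

fun main() {
    val tags = listOf(
        OgTag("og:image", "https://example.com/a.png"),
        OgTag("og:image:width", "300"),
        OgTag("og:image:height", "200"),
        OgTag("og:image", "https://example.com/b.png"),
        OgTag("og:image:width", "50"),
    )
    associate(tags, "og:image").forEach(::println)
}
```

Keeping the walk order-dependent mirrors the Open Graph convention that structured properties follow the root tag they describe.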

## Development Notes

- **JVM Toolchain**: Java 24 (see `jvmToolchain(24)` in build files)
- **Testing**: CI matrix tests on Java 17 and 23 via GitHub Actions
- **Dependencies**: Core library uses Ksoup (v0.2.5) for HTML parsing and network requests
- **Maven Coordinates**: Group `fr.lengrand`, artifact `opengraphkt`, currently at `0.1.2-SNAPSHOT`
- **Code Coverage**: Kover plugin enforces 70% minimum coverage threshold
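
For local testing after `./gradlew publishToMavenLocal`, a consumer build could reference the coordinates above roughly as follows. This is a sketch of a hypothetical consumer project's `build.gradle.kts`, not a file in this repository, and the version shown is whatever the build currently publishes.

```kotlin
// build.gradle.kts of a hypothetical consumer project
repositories {
    mavenLocal()    // resolves the artifact installed by publishToMavenLocal
    mavenCentral()  // resolves transitive dependencies
}

dependencies {
    // Coordinates taken from the notes above; adjust the version if it has moved on.
    implementation("fr.lengrand:opengraphkt:0.1.2-SNAPSHOT")
}
```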