imrobintomar/H-Lens

H-Lens

Bibliometric transparency dashboard: analyse any researcher's h-index, detect pivotal self-citations and mutual citation rings, and decompose index inflation across 15 free open databases.

Built as the implementation backbone for a two-study empirical research programme on pivotal self-citation rates (target journal: Scientometrics).


What it does

| Feature | Description |
| --- | --- |
| Author search | Name search (OpenAlex), Name + Affiliation (ORCID registry), or direct ORCID iD lookup |
| Self-citation analysis | Sandnes (2026) pivotal self-citation plot + PSCR score (binary, count, weighted) |
| Mutual citation detection | "You cite me, I cite you" ring detection with Mutual Citation Score (MCS) per pair |
| Index decomposition | h-index and i10-index with and without self-citations; SC inflation score |
| Multi-database enrichment | Cross-references OpenAlex with PubMed/iCite (NIH), Europe PMC, Crossref, ORCID |
| Batch upload | CSV/Excel upload for up to 500 authors; results exportable as CSV |
| Data quality warning | Confidence-scored name-match recovery for papers with disambiguation failures |
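
To make the index decomposition row concrete, here is a minimal sketch of computing the h-index and i10-index with and without self-citations. It is an illustration only, not the H-Lens implementation: the function names (`h_index`, `i10_index`, `decompose`) and the input shape (two parallel lists of per-paper counts) are assumptions.

```python
from typing import Dict, Sequence


def h_index(citations: Sequence[int]) -> int:
    """Largest h such that at least h papers have >= h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations: Sequence[int]) -> int:
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


def decompose(total: Sequence[int], self_cites: Sequence[int]) -> Dict[str, int]:
    """Compare both indices on total citations and on external citations only.

    Hypothetical helper: `total[i]` and `self_cites[i]` describe the same paper.
    """
    external = [max(t - s, 0) for t, s in zip(total, self_cites)]
    return {
        "h_with_self": h_index(total),
        "h_without_self": h_index(external),
        "i10_with_self": i10_index(total),
        "i10_without_self": i10_index(external),
    }


result = decompose(total=[12, 10, 6, 5, 4, 1], self_cites=[3, 1, 2, 2, 1, 0])
print(result)
```

In this made-up example the h-index drops from 4 to 3 and the i10-index from 2 to 0 once self-citations are removed. How H-Lens turns such a gap into its SC inflation score is not specified here, so the sketch stops at the raw indices.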

Quick start

1. Backend

cd H-index

# Install Python dependencies
pip install fastapi uvicorn pyalex habanero arxiv semanticscholar \
            biopython requests pandas networkx python-multipart

# Start the API server
PYTHONPATH=src uvicorn api.main:app --port 8000 --reload

API docs available at http://localhost:8000/docs

2. Frontend

cd frontend
npm install
npm run dev

Open http://localhost:5173


Databases used

| # | Database | Role | Cost |
| --- | --- | --- | --- |
| 1 | OpenAlex | Primary: h-index, works, citations | Free |
| 2 | PubMed / PMC | Life Sciences full-text | Free |
| 3 | ORCID | Author identity anchor | Free |
| 4 | Semantic Scholar | CS/Eng validation | Free |
| 5 | Crossref | DOI bridge + references | Free |
| 6 | arXiv | Physics/CS full-text | Free |
| 7 | Europe PMC | EU-funded papers | Free |
| 8 | CORE | Social sciences OA | Free |
| 9 | bioRxiv/medRxiv | Preprint comparison | Free |
| 10 | INSPIRE-HEP | High-energy physics | Free |
| 11 | DBLP | CS conference papers | Free |
| 12 | Lens.org | Patent cross-citations | Free |
| 13 | Unpaywall | OA fallback retrieval | Free |
| 14 | ROR | Institution registry | Free |
| 15 | Wikidata | Prestige proxy | Free |
| + | iCite / NIH | Field-normalised RCR | Free |
| + | OpenCitations | Citation graph | Free |
| + | Dimensions.ai | Cross-validation | Free (key) |

Total cost: $0 - all databases are freely accessible for academic research use.


API endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /search?q={name} | Quick author name search |
| GET | /search/orcid?name=X&affiliation=Y | Precise ORCID lookup |
| GET | /search/orcid?orcid_id=0000-... | Direct ORCID iD lookup |
| GET | /author/{id} | Author overview + position stats |
| GET | /author/{id}/self | Sandnes plot + PSCR analysis |
| GET | /author/{id}/mutual | Mutual citation network |
| GET | /author/{id}/decompose | Index decomposition |
| GET | /author/{id}/multi | Multi-database comparison |
| GET | /author/{id}/patents | Patent self-citations (Lens.org) |
| POST | /upload | Batch CSV analysis |
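
As a rough illustration of how a client might address the GET endpoints listed above, the sketch below builds request URLs with the standard library. The base URL matches the local server from the quick start; the helper names and the author id `A123` are hypothetical, and the id format the API actually expects is not documented here.

```python
from urllib.parse import quote, urlencode

# Local development server from the quick start section.
BASE_URL = "http://localhost:8000"


def search_url(name: str) -> str:
    """URL for the quick author name search (GET /search?q={name})."""
    return f"{BASE_URL}/search?{urlencode({'q': name})}"


def author_url(author_id: str, view: str = "") -> str:
    """URL for an author endpoint; `view` is e.g. 'self', 'mutual', 'decompose'."""
    path = f"{BASE_URL}/author/{quote(author_id, safe='')}"
    return f"{path}/{view}" if view else path


print(search_url("Jane Doe"))        # placeholder name
print(author_url("A123"))            # placeholder id
print(author_url("A123", "self"))
```

The resulting strings can be passed to any HTTP client, for example `requests.get(url)`, once the backend is running.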

Research background

This tool implements the empirical infrastructure for:

  • Study 1 - Pivotal Self-Citation Rates Across Disciplines, Career Stages, and Demographics: A Large-Scale Empirical Analysis Using OpenAlex (target: Scientometrics)
  • Study 2 - Inside the Pivotal Self-Citation: A Citation Context Analysis of Why Authors Pivotally Self-Cite (target: Journal of Informetrics)

Both studies extend Sandnes (2026): Understanding the h-index with self-citation visualizations, Scientometrics, 131, 1979-1989.


Key concepts

Pivotal self-citation - a self-citation to an h-core paper where removing that citation would drop the paper below the h-threshold (making it critical to the h-index).

PSCR (Pivotal Self-Citation Rate) - proportion of h-core papers containing at least one pivotal self-citation.
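
The binary PSCR can be sketched in a few lines. This is one reading of the definitions above, not the tool's confirmed method: it assumes the h-core is every paper with at least h citations, and that a paper holds a pivotal self-citation when its citation count excluding self-citations falls below h. The function name and inputs are hypothetical, and the count and weighted variants are not covered.

```python
from typing import Sequence


def pscr_binary(total: Sequence[int], self_cites: Sequence[int]) -> float:
    """Share of h-core papers that would leave the h-core without self-citations.

    Assumptions (not confirmed by the source):
      * h-core = all papers with >= h total citations
      * pivotal = total >= h but (total - self) < h
    """
    # h-index: length of the prefix where the rank-th best paper has >= rank citations.
    ranked = sorted(total, reverse=True)
    h = sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)
    if h == 0:
        return 0.0

    core = [(t, s) for t, s in zip(total, self_cites) if t >= h]
    pivotal = sum(1 for t, s in core if t - s < h)
    return pivotal / len(core)


# Made-up author: h = 4; two of the four h-core papers depend on self-citations.
score = pscr_binary(total=[12, 10, 6, 5, 3, 1], self_cites=[3, 1, 3, 2, 1, 0])
print(score)
```

Here the papers with 6 and 5 citations fall to 3 external citations each, below h = 4, so the sketch reports 2 of 4 h-core papers as pivotal.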

MCS (Mutual Citation Score) - min(A cites B, B cites A) / max(A cites B, B cites A). Score of 1.0 = perfectly symmetric; flagged when MCS > 0.6 and mutual count >= 3.
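
The MCS formula and flagging rule translate directly into code. In this sketch the function names are hypothetical, and "mutual count" is assumed to mean the smaller of the two directional citation counts, which the text above does not spell out.

```python
def mutual_citation_score(a_cites_b: int, b_cites_a: int) -> float:
    """min / max of the two directional citation counts; 0.0 if neither cites the other."""
    larger = max(a_cites_b, b_cites_a)
    if larger == 0:
        return 0.0
    return min(a_cites_b, b_cites_a) / larger


def is_flagged(a_cites_b: int, b_cites_a: int,
               threshold: float = 0.6, min_mutual: int = 3) -> bool:
    """Flag a pair when MCS > threshold and the mutual count >= min_mutual.

    Assumption: mutual count = min(a_cites_b, b_cites_a).
    """
    mutual = min(a_cites_b, b_cites_a)
    return mutual_citation_score(a_cites_b, b_cites_a) > threshold and mutual >= min_mutual


print(mutual_citation_score(5, 4), is_flagged(5, 4))    # symmetric and frequent
print(mutual_citation_score(10, 2), is_flagged(10, 2))  # one-sided
print(mutual_citation_score(2, 2), is_flagged(2, 2))    # symmetric but too few
```

The third pair shows why the count condition matters: a perfectly symmetric score of 1.0 on only two citations each way is not flagged.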

RCR (Relative Citation Ratio) - NIH iCite field-normalised citation impact. RCR > 1.0 means above average for the field.


Environment variables

| Variable | Description | Required |
| --- | --- | --- |
| CORE_API_KEY | CORE.ac.uk free API key | Optional |
| S2_API_KEY | Semantic Scholar API key | Optional |
| LENS_API_KEY | Lens.org free API key | Optional |
| DIMENSIONS_API_KEY | Dimensions.ai free API key | Optional |
| NCBI_API_KEY | PubMed higher rate limit | Optional |

All optional - the system works without any keys using public rate limits.
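
A minimal sketch of reading these optional keys, assuming they are taken from the process environment; `load_api_keys` is a hypothetical helper, not a function from the H-Lens codebase. Missing or empty variables map to `None`, so callers can fall back to unauthenticated access.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names as listed in the table above.
OPTIONAL_KEYS = (
    "CORE_API_KEY",
    "S2_API_KEY",
    "LENS_API_KEY",
    "DIMENSIONS_API_KEY",
    "NCBI_API_KEY",
)


def load_api_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Return each optional key, or None when it is unset or empty."""
    source = os.environ if env is None else env
    return {name: source.get(name) or None for name in OPTIONAL_KEYS}


# Example with an explicit mapping instead of the real environment.
keys = load_api_keys({"S2_API_KEY": "abc", "NCBI_API_KEY": ""})
print(keys)
```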


License

Code: MIT
Data: All source databases are CC0 or equivalent open licence.


Citation

If you use H-Lens in your research:

H-Lens: Bibliometric transparency dashboard for pivotal self-citation analysis.
Built for: Pivotal Self-Citation Rates Across Disciplines, Career Stages, and Demographics (2026).
