MatExt was developed by materials researchers together with experts in natural language processing (NLP). It is based on a question-answering (QA) information extraction methodology and a text corpus, both developed through this collaboration in the national project DeeperMaterials.
This tool was supported by:
Question Answering Models for Information Extraction from Perovskite Materials Science Literature
M. Sipilä, F. Mehryary, S. Pyysalo, F. Ginter, Milica Todorović
arXiv:2405.15290 (2024)
@article{Sipila2024,
  title = {Question Answering Models for Information Extraction from Perovskite Materials Science Literature},
  author = {M. Sipilä and F. Mehryary and S. Pyysalo and F. Ginter and Milica Todorović},
  journal = {arXiv},
  year = {2024},
  eprint = {2405.15290},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL},
  url = {https://arxiv.org/abs/2405.15290}
}
Software libraries used
Software Library | License |
---|---|
Flask, Flask-Caching, Flask-SQLAlchemy, Flask-HTTPAuth, Flask-WTF, WTForms, python-dotenv, Torch (PyTorch), amqp, billiard, cachelib, celery, click, httpcore, httpx, idna, itsdangerous, Jinja2, joblib, kombu, MarkupSafe, numpy, prompt_toolkit, python-dateutil, vine, visitor, Werkzeug | BSD-3-Clause |
Chart.js, Bootstrap, SQLAlchemy, FastAPI, Pydantic, Uvicorn, aiohappyeyeballs, anyio, attrs, blinker, charset-normalizer, click-didyoumean, click-plugins, click-repl, dominate, exceptiongroup, greenlet, gunicorn, h11, redis, six, tqdm, urllib3, wcwidth | MIT |
Elasticsearch, elastic-transport, frozenlist, multidict, nltk, packaging, openai, regex, requests, sniffio, tzdata, yarl, aiohttp, aiosignal, async-timeout, Datasets, Tokenizers, Accelerate, Transformers | Apache License 2.0 |