MatExt was developed by materials researchers together with experts in natural language processing (NLP). It is based on a question answering (QA) methodology for information extraction and on a text corpus developed through this collaboration in the national project DeeperMaterials.


Collaborating Groups


People


Acknowledgements

This tool was supported by:


Citation

Question Answering Models for Information Extraction from Perovskite Materials Science Literature
M. Sipilä, F. Mehryary, S. Pyysalo, F. Ginter, Milica Todorović
arXiv:2405.15290 (2024)

@article{Sipila2024,
  title = {Question Answering Models for Information Extraction from Perovskite Materials Science Literature},
  author = {M. Sipilä and F. Mehryary and S. Pyysalo and F. Ginter and Milica Todorović},
  journal = {arXiv},
  year = {2024},
  eprint = {2405.15290},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL},
  url = {https://arxiv.org/abs/2405.15290}
}

Software libraries used

Software Library | License
Flask, Flask-Caching, Flask-SQLAlchemy, Flask-HTTPAuth, Flask-WTF, WTForms, python-dotenv, Torch (PyTorch), amqp, billiard, cachelib, celery, click, httpcore, httpx, idna, itsdangerous, Jinja2, joblib, kombu, MarkupSafe, numpy, prompt_toolkit, python-dateutil, vine, visitor, Werkzeug | BSD-3-Clause
Chart.js, Bootstrap, SQLAlchemy, FastAPI, Pydantic, Uvicorn, aiohappyeyeballs, anyio, attrs, blinker, charset-normalizer, click-didyoumean, click-plugins, click-repl, dominate, exceptiongroup, greenlet, gunicorn, h11, redis, six, tqdm, urllib3, wcwidth | MIT
Elasticsearch, elastic-transport, frozenlist, multidict, nltk, packaging, openai, regex, requests, sniffio, tzdata, yarl, aiohttp, aiosignal, async-timeout, Datasets, Tokenizers, Accelerate, Transformers | Apache License 2.0