Antonino Furnari

PREGO: online mistake detection in PRocedural EGOcentric videos

BibTeX Citation


                              @article{Plini2026TI-PREGO,
  author = { Leonardo Plini and Luca Scofano and Edoardo De Matteis and Guido Maria D'Amely di Melendugno and Alessandro Flaborea and Andrea Sanchietti and Giovanni Maria Farinella and Fabio Galasso and Antonino Furnari },
  journal = { Computer Vision and Image Understanding },
  title = { TI-PREGO: Chain of Thought and In-Context Learning for online mistake detection in PRocedural EGOcentric videos },
  year = { 2026 },
  url = { https://www.sciencedirect.com/science/article/pii/S1077314225003364 },
  pdf = { publications/Plini2026TIPREGO.pdf },
  doi = { 10.1016/j.cviu.2025.104613 },
  pages = {  },
}

Conference Version 2024

Integrating Affordances and Attention models for Short-Term Object Interaction Anticipation

journal 2026 🏆 2nd Place Ego-Exo4D Procedure Understanding Challenge 2025

Lorenzo Mur-Labadia , Ruben Martinez-Cantin , Jose J. Guerrero , Giovanni Maria Farinella , Antonino Furnari

IEEE Transactions on Pattern Analysis and Machine Intelligence

AFF-ttention! Affordances and Attention models for Short-Term Object Interaction Anticipation

BibTeX Citation


                              @article{MurLabadia2026-Integrating,
  pdf = { publications/Mur_Labadia2026Integrating.pdf },
  url = { https://ieeexplore.ieee.org/document/11344783 },
  pages = { 1-17 },
  number = {  },
  year = { 2026 },
  doi = { 10.1109/TPAMI.2026.3652831 },
  title = { Integrating Affordances and Attention models for Short-Term Object Interaction Anticipation },
  journal = { IEEE Transactions on Pattern Analysis and Machine Intelligence },
  author = { Lorenzo Mur-Labadia and Ruben Martinez-Cantin and Jose J. Guerrero and Giovanni Maria Farinella and Antonino Furnari },
}

Conference Version 2024

Ego-EXTRA: video-language Egocentric Dataset for EXpert-TRAinee assistance

conference 2026

Francesco Ragusa , Michele Mazzamuto , Rosario Forte , Irene D'Ambra , James Fort , Jakob Engel , Antonino Furnari , Giovanni Maria Farinella

IEEE Winter Conference on Applications of Computer Vision (WACV)

Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory

BibTeX Citation


                              @inproceedings{Ragusa2026Ego-EXTRA,
  year = { 2026 },
  booktitle = { IEEE Winter Conference on Applications of Computer Vision (WACV) },
  title = { Ego-EXTRA: video-language Egocentric Dataset for EXpert-TRAinee assistance },
  author = { Francesco Ragusa and Michele Mazzamuto and Rosario Forte and Irene D'Ambra and James Fort and Jakob Engel and Antonino Furnari and Giovanni Maria Farinella },
  pdf = { https://arxiv.org/pdf/2512.13238 },
}

conference 2026

Zaira Manigrasso , Matteo Dunnhofer , Antonino Furnari , Moritz Nottebaum , Antonio Finocchiaro , Davide Marana , Rosario Forte , Giovanni Maria Farinella , Christian Micheloni

IEEE Winter Conference on Applications of Computer Vision (WACV)

BibTeX Citation


                              @inproceedings{Manigrasso2026Online,
  year = { 2026 },
  booktitle = { IEEE Winter Conference on Applications of Computer Vision (WACV) },
  title = { Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory },
  author = { Zaira Manigrasso and Matteo Dunnhofer and Antonino Furnari and Moritz Nottebaum and Antonio Finocchiaro and Davide Marana and Rosario Forte and Giovanni Maria Farinella and Christian Micheloni },
  pdf = {  },
}

conference 2026

Exploring Multimodal LMMs for Online Episodic Memory Question Answering on the Edge

Giuseppe Lando , Rosario Forte , Antonino Furnari

International Conference on Computer Vision Theory and Applications (VISAPP)

arXiv PDF

BibTeX Citation


                              @inproceedings{forte2026exploring,
  title={Exploring Multimodal LMMs for Online Episodic Memory Question Answering on the Edge},
  author={Giuseppe Lando and Rosario Forte and Antonino Furnari},
  booktitle={International Conference on Computer Vision Theory and Applications (VISAPP)},
  year={2026},
  url={https://arxiv.org/abs/2602.22455},
  pdf={https://arxiv.org/pdf/2602.22455}
}

preprint 2026

Ego-METAS: an Egocentric online Multimodal Energy-efficient Temporal Action Segmentation benchmark

Maria Santos-Villafranca , Jesus Bermudez-Cameo , Alejandro Perez-Yus , Giovanni Maria Farinella , Antonino Furnari

arXiv preprint arXiv:2606.02246

Website arXiv Data

BibTeX Citation


                              @article{santosvillafranca2026egometas,
  title={Ego-METAS: an Egocentric online Multimodal Energy-efficient Temporal Action Segmentation benchmark},
  author={Santos-Villafranca, Maria and Bermudez-Cameo, Jesus and Perez-Yus, Alejandro and Farinella, Giovanni Maria and Furnari, Antonino},
  journal={arXiv preprint arXiv:2606.02246},
  year={2026},
  arxiv={2606.02246}
}

preprint 2026

EGOSTREAM: A Diagnostic Benchmark for Streaming Episodic Memory in Egocentric Vision

Rosario Forte , Giuseppe Lando , Antonino Furnari

arXiv preprint arXiv:2605.31557

arXiv PDF Website

BibTeX Citation


                              @article{forte2026egostream,
  title={EGOSTREAM: A Diagnostic Benchmark for Streaming Episodic Memory in Egocentric Vision},
  author={Forte, Rosario and Lando, Giuseppe and Furnari, Antonino},
  journal={arXiv preprint arXiv:2605.31557},
  year={2026}
}

preprint 2026

RECIPE: Procedural Planning via Grounding in Instructional Video

Luigi Seminara , Antonino Furnari , Lorenzo Torresani

arXiv preprint arXiv:2605.19976

arXiv PDF Website

BibTeX Citation


                              @article{seminara2026recipe,
  title={RECIPE: Procedural Planning via Grounding in Instructional Video},
  author={Seminara, Luigi and Furnari, Antonino and Torresani, Lorenzo},
  journal={arXiv preprint arXiv:2605.19976},
  year={2026}
}

2026

conference 2026

Retrieval-Augmented Online Textual Memory for Episodic Memory Video Question Answering

Raffaele Calì , Giuseppe Lando , Rosario Forte , Antonino Furnari

International Conference on Content-Based Multimedia Indexing (CBMI)

Conference

BibTeX Citation


                              @inproceedings{Cali2026Retrieval,
  author    = {Raffaele Cal{\`i} and Giuseppe Lando and Rosario Forte and Antonino Furnari},
  title     = {Retrieval-Augmented Online Textual Memory for Episodic Memory Video Question Answering},
  booktitle = {International Conference on Content-Based Multimedia Indexing (CBMI)},
  year      = {2026}
}

preprint 2026

SLU-2K: A Question-Based Benchmark for Semantic Evaluation of Sign Language Translation

Zeno Testa , Antonino Furnari , Lorenzo Baraldi , Natalia Díaz-Rodríguez

arXiv preprint arXiv:2606.03788

arXiv PDF Code

BibTeX Citation


                              @article{testa2026slu2k,
  title={SLU-2K: A Question-Based Benchmark for Semantic Evaluation of Sign Language Translation},
  author={Testa, Zeno and Furnari, Antonino and Baraldi, Lorenzo and D{\'i}az-Rodr{\'i}guez, Natalia},
  journal={arXiv preprint arXiv:2606.03788},
  year={2026},
  arxiv={2606.03788}
}

journal 2025

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Kristen Grauman , Andrew Westbury , Lorenzo Torresani , Kris Kitani , Jitendra Malik , Triantafyllos Afouras , Kumar Ashutosh , Vijay Baiyya , Siddhant Bansal , Bikram Boote , Eugene Byrne , Zach Chavis , Joya Chen , Feng Cheng , Fu-Jen Chu , Sean Crane , Avijit Dasgupta , Jing Dong , Maria Escobar , Cristhian Forigua , Abrham Gebreselasie , Sanjay Haresh , Jing Huang , Md Mohaiminul Islam , Suyog Jain , Rawal Khirodkar , Devansh Kukreja , Kevin J. Liang , Jia-Wei Liu , Sagnik Majumder , Yongsen Mao , Miguel Martin , Effrosyni Mavroudi , Tushar Nagarajan , Francesco Ragusa , Santhosh Kumar Ramakrishnan , Luigi Seminara , Arjun Somayazulu , Yale Song , Shan Su , Zihui Xue , Edward Zhang , Jinxu Zhang , Angela Castillo , Changan Chen , Xinzhu Fu , Ryosuke Furuta , Cristina González , Prince Gupta , Jiabo Hu , Yifei Huang , Yiming Huang , Weslie Khoo , Anush Kumar , Robert Kuo , Sach Lakhavani , Miao Liu , Mi Luo , Zhengyi Luo , Brighid Meredith , Austin Miller , Oluwatumininu Oguntola , Xiaqing Pan , Penny Peng , Shraman Pramanick , Merey Ramazanova , Fiona Ryan , Wei Shan , Kiran Somasundaram , Chenan Song , Audrey Southerland , Masatoshi Tateno , Huiyu Wang , Yuchen Wang , Takuma Yagi , Mingfei Yan , Xitong Yang , Zecheng Yu , Shengxin Cindy Zha , Chen Zhao , Ziwei Zhao , Zhifan Zhu , Jeff Zhuo , Pablo Arbeláez , Gedas Bertasius , David Crandall , Dima Damen , Jakob Engel , Giovanni Maria Farinella , Antonino Furnari , Bernard Ghanem , Judy Hoffman , C. V. Jawahar , Richard Newcombe , Hyun Soo Park , James M. Rehg , Yoichi Sato , Manolis Savva , Jianbo Shi , Mike Zheng Shou , Michael Wray

International Journal of Computer Vision

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

BibTeX Citation


                              @article{Grauman2025,
  author    = {Kristen Grauman and Andrew Westbury and Lorenzo Torresani and Kris Kitani and Jitendra Malik and Triantafyllos Afouras and Kumar Ashutosh and Vijay Baiyya and Siddhant Bansal and Bikram Boote and Eugene Byrne and Zach Chavis and Joya Chen and Feng Cheng and Fu-Jen Chu and Sean Crane and Avijit Dasgupta and Jing Dong and Maria Escobar and Cristhian Forigua and Abrham Gebreselasie and Sanjay Haresh and Jing Huang and Md Mohaiminul Islam and Suyog Jain and Rawal Khirodkar and Devansh Kukreja and Kevin J. Liang and Jia-Wei Liu and Sagnik Majumder and Yongsen Mao and Miguel Martin and Effrosyni Mavroudi and Tushar Nagarajan and Francesco Ragusa and Santhosh Kumar Ramakrishnan and Luigi Seminara and Arjun Somayazulu and Yale Song and Shan Su and Zihui Xue and Edward Zhang and Jinxu Zhang and Angela Castillo and Changan Chen and Xinzhu Fu and Ryosuke Furuta and Cristina González and Prince Gupta and Jiabo Hu and Yifei Huang and Yiming Huang and Weslie Khoo and Anush Kumar and Robert Kuo and Sach Lakhavani and Miao Liu and Mi Luo and Zhengyi Luo and Brighid Meredith and Austin Miller and Oluwatumininu Oguntola and Xiaqing Pan and Penny Peng and Shraman Pramanick and Merey Ramazanova and Fiona Ryan and Wei Shan and Kiran Somasundaram and Chenan Song and Audrey Southerland and Masatoshi Tateno and Huiyu Wang and Yuchen Wang and Takuma Yagi and Mingfei Yan and Xitong Yang and Zecheng Yu and Shengxin Cindy Zha and Chen Zhao and Ziwei Zhao and Zhifan Zhu and Jeff Zhuo and Pablo Arbeláez and Gedas Bertasius and David Crandall and Dima Damen and Jakob Engel and Giovanni Maria Farinella and Antonino Furnari and Bernard Ghanem and Judy Hoffman and C. V. Jawahar and Richard Newcombe and Hyun Soo Park and James M. Rehg and Yoichi Sato and Manolis Savva and Jianbo Shi and Mike Zheng Shou and Michael Wray},
  title     = {Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives},
  journal   = {International Journal of Computer Vision},
  year      = {2025},
  month     = nov,
  day       = {24},
  volume    = {},
  number    = {},
  pages     = {},
  doi       = {10.1007/s11263-025-02557-6},
  url       = {https://doi.org/10.1007/s11263-025-02557-6},
  issn      = {1573-1405},
  pdf = {https://link.springer.com/content/pdf/10.1007/s11263-025-02557-6.pdf}
}

Conference Version 2024

Exocentric-to-Egocentric Adaptation for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs

journal 2025

Camillo Quattrocchi , Antonino Furnari , Daniele Di Mauro , Mario Valerio Giuffrida , Giovanni Maria Farinella

International Journal of Computer Vision (IJCV)

Project

BibTeX Citation


                              @article{quattrocchi2024synchronization,
  year = { 2025 },
  journal = { International Journal of Computer Vision (IJCV) },
  title = { Exocentric-to-Egocentric Adaptation for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs },
  author = { Camillo Quattrocchi and Antonino Furnari and Daniele Di Mauro and Mario Valerio Giuffrida and Giovanni Maria Farinella },
  url = { https://github.com/fpv-iplab/synchronization-is-all-you-need },
}

Conference Version 2024

Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs

EASG-Bench: Video Q&A Benchmark with Egocentric Action Scene Graphs

conference 2025

Ivan Rodin , Tz-Ying Wu , Kyle Min , Sharath Nittur Sridhar , Antonino Furnari , Subarna Tripathi , Giovanni Maria Farinella

IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

PDF arXiv

BibTeX Citation


                              @inproceedings{Rodin2025EASG-Bench,
  year = { 2025 },
  booktitle = { IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) },
  title = { EASG-Bench: Video Q\&A Benchmark with Egocentric Action Scene Graphs },
  author = { Ivan Rodin and Tz-Ying Wu and Kyle Min and Sharath Nittur Sridhar and Antonino Furnari and Subarna Tripathi and Giovanni Maria Farinella },
  url = { https://arxiv.org/abs/2506.05787 },
  pdf = { https://arxiv.org/pdf/2506.05787.pdf },
}

conference 2025 🏆 Best Student Paper Award

How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering?

Giuseppe Lando , Rosario Forte , Giovanni Maria Farinella , Antonino Furnari

Proceedings of the 23rd International Conference on Image Analysis and Processing (ICIAP)

arXiv PDF

BibTeX Citation


                              @inproceedings{Lando2025HowFar,
  author    = {Giuseppe Lando and Rosario Forte and Giovanni Maria Farinella and Antonino Furnari},
  title     = {How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering?},
  booktitle = {Proceedings of the 23rd International Conference on Image Analysis and Processing (ICIAP)},
  year      = {2025}
}

journal 2024

An Outlook into the Future of Egocentric Vision

Chiara Plizzari , Gabriele Goletto , Antonino Furnari , Siddhant Bansal , Francesco Ragusa , Giovanni Maria Farinella , Dima Damen , Tatiana Tommasi

International Journal of Computer Vision (IJCV)

Ego4D: Around the World in 3,000 Hours of Egocentric Video

BibTeX Citation


                              @article{Plizzari2024AnOutlook,
  author = { Chiara Plizzari and Gabriele Goletto and Antonino Furnari and Siddhant Bansal and Francesco Ragusa and Giovanni Maria Farinella and Dima Damen and Tatiana Tommasi },
  journal = {  International Journal of Computer Vision (IJCV)  },
  title = {  An Outlook into the Future of Egocentric Vision  },
  year = {2024},
  pdf = {https://link.springer.com/content/pdf/10.1007/s11263-024-02095-7.pdf}
}

journal 2024 🏆 EgoVis 2022/2023 distinguished paper award

Kristen Grauman , Andrew Westbury , Eugene Byrne , Vincent Cartillier , Zachary Chavis , Antonino Furnari , Rohit Girdhar , Jackson Hamburger , Hao Jiang , Devansh Kukreja , Miao Liu , Xingyu Liu , Miguel Martin , Tushar Nagarajan , Ilija Radosavovic , Santhosh Kumar Ramakrishnan , Fiona Ryan , Jayant Sharma , Michael Wray , Mengmeng Xu , Eric Zhongcong Xu , Chen Zhao , Siddhant Bansal , Dhruv Batra , Sean Crane , Tien Do , Morrie Doulaty , Akshay Erapalli , Christoph Feichtenhofer , Adriano Fragomeni , Qichen Fu , Abrham Gebreselasie , Cristina Gonzalez , James Hillis , Xuhua Huang , Yifei Huang , Wenqi Jia , Weslie Khoo , Jachym Kolar , Satwik Kottur , Anurag Kumar , Federico Landini , Chao Li , Yanghao Li , Zhenqiang Li , Karttikeya Mangalam , Raghava Modhugu , Jonathan Munro , Tullie Murrell , Takumi Nishiyasu , Will Price , Paola Ruiz Puentes , Merey Ramazanova , Leda Sari , Kiran Somasundaram , Audrey Southerland , Yusuke Sugano , Ruijie Tao , Minh Vo , Yuchen Wang , Xindi Wu , Takuma Yagi , Ziwei Zhao , Yunyi Zhu , Pablo Arbelaez , David Crandall , Dima Damen , Giovanni Maria Farinella , Christian Fuegen , Bernard Ghanem , Vamsi Krishna Ithapu , C.V. Jawahar , Hanbyul Joo , Kris Kitani , Haizhou Li , Richard Newcombe , Aude Oliva , Hyun Soo Park , James M. Rehg , Yoichi Sato , Jianbo Shi , Mike Zheng Shou , Antonio Torralba , Lorenzo Torresani , Mingfei Yan , Jitendra Malik

IEEE Transactions on Pattern Analysis and Machine Intelligence

Action Scene Graphs for Long-Form Understanding of Egocentric Videos

BibTeX Citation


                              @ARTICLE{Grauman20241,
	author = {Grauman, Kristen and Westbury, Andrew and Byrne, Eugene and Cartillier, Vincent and Chavis, Zachary and Furnari, Antonino and Girdhar, Rohit and Hamburger, Jackson and Jiang, Hao and Kukreja, Devansh and Liu, Miao and Liu, Xingyu and Martin, Miguel and Nagarajan, Tushar and Radosavovic, Ilija and Ramakrishnan, Santhosh Kumar and Ryan, Fiona and Sharma, Jayant and Wray, Michael and Xu, Mengmeng and Xu, Eric Zhongcong and Zhao, Chen and Bansal, Siddhant and Batra, Dhruv and Crane, Sean and Do, Tien and Doulaty, Morrie and Erapalli, Akshay and Feichtenhofer, Christoph and Fragomeni, Adriano and Fu, Qichen and Gebreselasie, Abrham and Gonzalez, Cristina and Hillis, James and Huang, Xuhua and Huang, Yifei and Jia, Wenqi and Khoo, Weslie and Kolar, Jachym and Kottur, Satwik and Kumar, Anurag and Landini, Federico and Li, Chao and Li, Yanghao and Li, Zhenqiang and Mangalam, Karttikeya and Modhugu, Raghava and Munro, Jonathan and Murrell, Tullie and Nishiyasu, Takumi and Price, Will and Puentes, Paola Ruiz and Ramazanova, Merey and Sari, Leda and Somasundaram, Kiran and Southerland, Audrey and Sugano, Yusuke and Tao, Ruijie and Vo, Minh and Wang, Yuchen and Wu, Xindi and Yagi, Takuma and Zhao, Ziwei and Zhu, Yunyi and Arbelaez, Pablo and Crandall, David and Damen, Dima and Farinella, Giovanni Maria and Fuegen, Christian and Ghanem, Bernard and Ithapu, Vamsi Krishna and Jawahar, C.V. and Joo, Hanbyul and Kitani, Kris and Li, Haizhou and Newcombe, Richard and Oliva, Aude and Park, Hyun Soo and Rehg, James M. and Sato, Yoichi and Shi, Jianbo and Shou, Mike Zheng and Torralba, Antonio and Torresani, Lorenzo and Yan, Mingfei and Malik, Jitendra},
	title = {Ego4D: Around the World in 3,000 Hours of Egocentric Video},
	year = {2024},
  pdf = {https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10611736},
	journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
	pages = {1--32},
	doi = {10.1109/TPAMI.2024.3381075},
}

conference 2024

Ivan Rodin , Antonino Furnari , Kyle Min , Subarna Tripathi , Giovanni Maria Farinella

Conference on Computer Vision and Pattern Recognition (CVPR)

AFF-ttention! Affordances and Attention models for Short-Term Object Interaction Anticipation

BibTeX Citation


                              @inproceedings{rodin2023action,
  primaryclass = { cs.CV },
  archiveprefix = { arXiv },
  eprint = { 2312.03391 },
  pdf = {https://arxiv.org/pdf/2312.03391.pdf},
  year = {2024},
  booktitle = {  Conference on Computer Vision and Pattern Recognition (CVPR)  },
  title = {Action Scene Graphs for Long-Form Understanding of Egocentric Videos},
  author = {Ivan Rodin and Antonino Furnari and Kyle Min and Subarna Tripathi and Giovanni Maria Farinella}
}

conference 2024 🏆 2nd Place Ego-Exo4D Procedure Understanding Challenge 2025

Lorenzo Mur-Labadia , Ruben Martinez-Cantin , Josechu Guerrero , Giovanni Maria Farinella , Antonino Furnari

European Conference on Computer Vision (ECCV)

Integrating Affordances and Attention models for Short-Term Object Interaction Anticipation

BibTeX Citation


                              @inproceedings{mur-labadia2024AFF-ttention,
  pdf = {https://arxiv.org/pdf/2406.01194.pdf},
  year = {2024},
  booktitle = { European Conference on Computer Vision (ECCV) },
  title = { AFF-ttention! Affordances and Attention models for Short-Term Object Interaction Anticipation },
  author = { Lorenzo Mur-Labadia and Ruben Martinez-Cantin and Josechu Guerrero and Giovanni Maria Farinella and Antonino Furnari },
}

Journal Version 2026

Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos

conference 2024 🏆 EgoVis Distinguished Paper Award 2024/2025 🏆 Highlight Top 2% 🏆 1st Place Ego-Exo4D Procedure Understanding Challenge 2025

Luigi Seminara , Giovanni Maria Farinella , Antonino Furnari

Advances in Neural Information Processing Systems

PDF Code

BibTeX Citation


                              @inproceedings{seminara2024differentiable,
 author = {Seminara, Luigi and Farinella, Giovanni Maria and Furnari, Antonino},
 booktitle = {Advances in Neural Information Processing Systems},
 title = {Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos},
 pdf = {https://arxiv.org/pdf/2406.01486.pdf},
 url = {https://github.com/fpv-iplab/Differentiable-Task-Graph-Learning},
 year = {2024}
}

Journal Version 2026

Task Graph Maximum Likelihood Estimation for Procedural Activity Understanding in Egocentric Videos

Code

conference 2024 🏆 EgoVis Distinguished Paper Award 2024/2025 🏆 Oral Top 1%

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Kristen Grauman , Andrew Westbury , Lorenzo Torresani , Kris Kitani , Jitendra Malik , Triantafyllos Afouras , Kumar Ashutosh , Vijay Baiyya , Siddhant Bansal , Bikram Boote , Eugene Byrne , Zach Chavis , Joya Chen , Feng Cheng , Fu-Jen Chu , Sean Crane , Avijit Dasgupta , Jing Dong , Maria Escobar , Cristhian Forigua , Abrham Gebreselasie , Sanjay Haresh , Jing Huang , Md Mohaiminul Islam , Suyog Jain , Rawal Khirodkar , Devansh Kukreja , Kevin J Liang , Jia-Wei Liu , Sagnik Majumder , Yongsen Mao , Miguel Martin , Effrosyni Mavroudi , Tushar Nagarajan , Francesco Ragusa , Santhosh Kumar Ramakrishnan , Luigi Seminara , Arjun Somayazulu , Yale Song , Shan Su , Zihui Xue , Edward Zhang , Jinxu Zhang , Angela Castillo , Changan Chen , Xinzhu Fu , Ryosuke Furuta , Cristina Gonzalez , Prince Gupta , Jiabo Hu , Yifei Huang , Yiming Huang , Weslie Khoo , Anush Kumar , Robert Kuo , Sach Lakhavani , Miao Liu , Mi Luo , Zhengyi Luo , Brighid Meredith , Austin Miller , Oluwatumininu Oguntola , Xiaqing Pan , Penny Peng , Shraman Pramanick , Merey Ramazanova , Fiona Ryan , Wei Shan , Kiran Somasundaram , Chenan Song , Audrey Southerland , Masatoshi Tateno , Huiyu Wang , Yuchen Wang , Takuma Yagi , Mingfei Yan , Xitong Yang , Zecheng Yu , Shengxin Cindy Zha , Chen Zhao , Ziwei Zhao , Zhifan Zhu , Jeff Zhuo , Pablo Arbelaez , Gedas Bertasius , David Crandall , Dima Damen , Jakob Engel , Giovanni Maria Farinella , Antonino Furnari , Bernard Ghanem , Judy Hoffman , C. V. Jawahar , Richard Newcombe , Hyun Soo Park , James M. Rehg , Yoichi Sato , Manolis Savva , Jianbo Shi , Mike Zheng Shou , Michael Wray

Conference on Computer Vision and Pattern Recognition (CVPR)

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

BibTeX Citation


                              @inproceedings{grauman2023egoexo4d,
  primaryclass = { cs.CV },
  archiveprefix = { arXiv },
  eprint = { 2311.18259 },
  pdf = {https://arxiv.org/pdf/2311.18259.pdf},
  url = {https://ego-exo4d-data.org/},
  year = {2024},
  booktitle = {  Conference on Computer Vision and Pattern Recognition (CVPR)  },
  title = { Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives },
  author = { Kristen Grauman and Andrew Westbury and Lorenzo Torresani and Kris Kitani and Jitendra Malik and Triantafyllos Afouras and Kumar Ashutosh and Vijay Baiyya and Siddhant Bansal and Bikram Boote and Eugene Byrne and Zach Chavis and Joya Chen and Feng Cheng and Fu-Jen Chu and Sean Crane and Avijit Dasgupta and Jing Dong and Maria Escobar and Cristhian Forigua and Abrham Gebreselasie and Sanjay Haresh and Jing Huang and Md Mohaiminul Islam and Suyog Jain and Rawal Khirodkar and Devansh Kukreja and Kevin J Liang and Jia-Wei Liu and Sagnik Majumder and Yongsen Mao and Miguel Martin and Effrosyni Mavroudi and Tushar Nagarajan and Francesco Ragusa and Santhosh Kumar Ramakrishnan and Luigi Seminara and Arjun Somayazulu and Yale Song and Shan Su and Zihui Xue and Edward Zhang and Jinxu Zhang and Angela Castillo and Changan Chen and Xinzhu Fu and Ryosuke Furuta and Cristina Gonzalez and Prince Gupta and Jiabo Hu and Yifei Huang and Yiming Huang and Weslie Khoo and Anush Kumar and Robert Kuo and Sach Lakhavani and Miao Liu and Mi Luo and Zhengyi Luo and Brighid Meredith and Austin Miller and Oluwatumininu Oguntola and Xiaqing Pan and Penny Peng and Shraman Pramanick and Merey Ramazanova and Fiona Ryan and Wei Shan and Kiran Somasundaram and Chenan Song and Audrey Southerland and Masatoshi Tateno and Huiyu Wang and Yuchen Wang and Takuma Yagi and Mingfei Yan and Xitong Yang and Zecheng Yu and Shengxin Cindy Zha and Chen Zhao and Ziwei Zhao and Zhifan Zhu and Jeff Zhuo and Pablo Arbelaez and Gedas Bertasius and David Crandall and Dima Damen and Jakob Engel and Giovanni Maria Farinella and Antonino Furnari and Bernard Ghanem and Judy Hoffman and C. V. Jawahar and Richard Newcombe and Hyun Soo Park and James M. Rehg and Yoichi Sato and Manolis Savva and Jianbo Shi and Mike Zheng Shou and Michael Wray },
}

Journal Version 2025

PREGO: online mistake detection in PRocedural EGOcentric videos

conference 2024

Alessandro Flaborea , Guido D'Amely , Leonardo Plini , Luca Scofano , Edoardo De Matteis , Antonino Furnari , Giovanni Maria Farinella , Fabio Galasso

Conference on Computer Vision and Pattern Recognition (CVPR)

TI-PREGO: Chain of Thought and In-Context Learning for online mistake detection in PRocedural EGOcentric videos

BibTeX Citation


                              @inproceedings{flaborea2024PREGO,
  year = {2024},
  booktitle = {  Conference on Computer Vision and Pattern Recognition (CVPR)  },
  title = {  PREGO: online mistake detection in PRocedural EGOcentric videos  },
  author = { Alessandro Flaborea and Guido D'Amely and Leonardo Plini and Luca Scofano and Edoardo De Matteis and Antonino Furnari and Giovanni Maria Farinella and Fabio Galasso },
  pdf={https://arxiv.org/pdf/2404.01933},
  url={https://github.com/aleflabo/PREGO?tab=readme-ov-file}

}

Journal Version 2026

Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs

conference 2024

Camillo Quattrocchi , Antonino Furnari , Daniele Di Mauro , Mario Valerio Giuffrida , Giovanni Maria Farinella

European Conference on Computer Vision (ECCV)

Exocentric-to-Egocentric Adaptation for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs

BibTeX Citation


                              @inproceedings{quattrocchi2024synchronization,
  pdf = {https://arxiv.org/pdf/2312.02638.pdf},
  year = {2024},
  booktitle = { European Conference on Computer Vision (ECCV) },
  title = { Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs },
  author = { Camillo Quattrocchi and Antonino Furnari and Daniele Di Mauro and Mario Valerio Giuffrida and Giovanni Maria Farinella },
  url = {https://github.com/fpv-iplab/synchronization-is-all-you-need}
}

Journal Version 2025

Project

conference 2022 🏆 Oral Top 1% 🏆 EgoVis 2022/2023 distinguished paper award

Around the World in 3,000 Hours of Egocentric Video

Kristen Grauman , Andrew Westbury , Eugene Byrne , Zachary Chavis , Antonino Furnari , Rohit Girdhar , Jackson Hamburger , Hao Jiang , Miao Liu , Xingyu Liu , Miguel Martin , Tushar Nagarajan , Ilija Radosavovic , Santhosh Kumar Ramakrishnan , Fiona Ryan , Jayant Sharma , Michael Wray , Mengmeng Xu , Eric Zhongcong Xu , Chen Zhao , Siddhant Bansal , Dhruv Batra , Vincent Cartillier , Sean Crane , Tien Do , Morrie Doulaty , Akshay Erapalli , Christoph Feichtenhofer , Adriano Fragomeni , Qichen Fu , Christian Fuegen , Abrham Gebreselasie , Cristina Gonzalez , James Hillis , Xuhua Huang , Yifei Huang , Wenqi Jia , Weslie Khoo , Jachym Kolar , Satwik Kottur , Anurag Kumar , Federico Landini , Chao Li , Yanghao Li , Zhenqiang Li , Karttikeya Mangalam , Raghava Modhugu , Jonathan Munro , Tullie Murrell , Takumi Nishiyasu , Will Price , Paola Ruiz Puentes , Merey Ramazanova , Leda Sari , Kiran Somasundaram , Audrey Southerland , Yusuke Sugano , Ruijie Tao , Minh Vo , Yuchen Wang , Xindi Wu , Takuma Yagi , Yunyi Zhu , Pablo Arbelaez , David Crandall , Dima Damen , Giovanni Maria Farinella , Bernard Ghanem , Vamsi Krishna Ithapu , C. V. Jawahar , Hanbyul Joo , Kris Kitani , Haizhou Li , Richard Newcombe , Aude Oliva , Hyun Soo Park , James M. Rehg , Yoichi Sato , Jianbo Shi , Mike Zheng Shou , Antonio Torralba , Lorenzo Torresani , Mingfei Yan , Jitendra Malik

IEEE/CVF Conference on Computer Vision and Pattern Recognition

Ego4D: Around the World in 3,000 Hours of Egocentric Video

BibTeX Citation


                              @inproceedings{grauman2022around,
  author = { Kristen Grauman and Andrew Westbury and Eugene Byrne and Zachary Chavis and Antonino Furnari and Rohit Girdhar and Jackson Hamburger and Hao Jiang and Miao Liu and Xingyu Liu and Miguel Martin and Tushar Nagarajan and Ilija Radosavovic and Santhosh Kumar Ramakrishnan and Fiona Ryan and Jayant Sharma and Michael Wray and Mengmeng Xu and Eric Zhongcong Xu and Chen Zhao and Siddhant Bansal and Dhruv Batra and Vincent Cartillier and Sean Crane and Tien Do and Morrie Doulaty and Akshay Erapalli and Christoph Feichtenhofer and Adriano Fragomeni and Qichen Fu and Christian Fuegen and Abrham Gebreselasie and Cristina Gonzalez and James Hillis and Xuhua Huang and Yifei Huang and Wenqi Jia and Weslie Khoo and Jachym Kolar and Satwik Kottur and Anurag Kumar and Federico Landini and Chao Li and Yanghao Li and Zhenqiang Li and Karttikeya Mangalam and Raghava Modhugu and Jonathan Munro and Tullie Murrell and Takumi Nishiyasu and Will Price and Paola Ruiz Puentes and Merey Ramazanova and Leda Sari and Kiran Somasundaram and Audrey Southerland and Yusuke Sugano and Ruijie Tao and Minh Vo and Yuchen Wang and Xindi Wu and Takuma Yagi and Yunyi Zhu and Pablo Arbelaez and David Crandall and Dima Damen and Giovanni Maria Farinella and Bernard Ghanem and Vamsi Krishna Ithapu and C. V. Jawahar and Hanbyul Joo and Kris Kitani and Haizhou Li and Richard Newcombe and Aude Oliva and Hyun Soo Park and James M. Rehg and Yoichi Sato and Jianbo Shi and Mike Zheng Shou and Antonio Torralba and Lorenzo Torresani and Mingfei Yan and Jitendra Malik },
  title = {  Around the {W}orld in 3,000 {H}ours of {E}gocentric {V}ideo  },
  booktitle = {  IEEE/CVF Conference on Computer Vision and Pattern Recognition  },
  year = {2022},
  pdf = { https://arxiv.org/pdf/2110.07058.pdf },
  url = { https://ego4d-data.org/ },
}

Journal Version 2024

Talks & Presentations

3 June 2026 Workshop Talk

Towards Always‑On Wearable AI That Perceives, Understands, and Assists

VITA Workshop @ CVPR 2026

3 June 2026 Workshop Co-organization

Co-organizing the EgoVis Workshop

CVPR 2026

3-7 June 2026 Oral & Poster Presentations

ViterbiPlanNet: Injecting Procedural Knowledge via Differentiable Viterbi for Planning in Instructional Videos

CVPR 2026 Workshops & Main Conference

3 June 2026 Poster Presentations

Extended Abstracts (Collab. with Other Groups)

CVPR 2026 Workshops

22 May 2026 Invited Talk

Towards Embodied Human-AI Symbiosis with Egocentric Vision

MaP Robotics, Vision, and Controls Talks — ETH Zurich

17 June 2025 Invited Talk

Towards an Embodied Understanding of Human Behaviour with Egocentric Vision

University of Pennsylvania - Summer GRASP Seminars

12 June 2025 Workshop Talk

Precognition in Egocentric Vision: From Short-Term Interactions to Long-Term Procedural Understanding

Precognition Workshop @ CVPR 2025

23 May 2025 Workshop Talk

Egocentric Vision: An Anticipatory Sensor for Wearable Robots - From Action Forecasting to Procedural Understanding

Enhancing human mobility: From computer vision-based motion tracking to wearable assistive robot control workshop @ ICRA 2025

28 Nov. 2024 Workshop Talk

Beyond atomic actions: towards long-form and procedural understanding of egocentric videos

Video Understanding Applications workshop @ BMVC 2024, Glasgow, UK

24 Apr. 2024 Invited Talk

Democratizing the Access to AI through Egocentric Vision

Research Seminars, Master in Robotics, Graphics and Computer Vision - University of Zaragoza

17 Oct. 2024 Tutorial

EgoExo4D Overview

Aria Tutorial @ ECCV 2024, Milan, IT

10 April 2026 Invited Talk

From Perception to Partnership: A Path Toward Collaborative AI via Egocentric Vision

Bocconi University

20 Oct. 2025 Workshop Talk

Action-Centric Graphs for Procedural Video Understanding

ICCV 2025 Workshop on Scene Graphs and Graph Representation Learning

19 Oct. 2025 Workshop Talk

From Observation to Intervention: Mistake Detection and Procedural Assistance in Egocentric Vision

ICCV 2025 Workshop on AI-driven Skilled Activity Understanding, Assessment & Feedback Generation

16 Sept. 2025 Workshop Talk

Egocentric Vision in the Kitchen: From Interaction Prediction to Recipe‑Level Understanding

AICV4Food workshop @ ICIAP 2025

21 July 2025 Workshop Talk

Learning to Act like a Pro: Discovering Procedural Models of Expert Workflows for Assistance and Validation

AMBEATion Workshop

17 July 2025 Invited Talk

Learning to See the World from an Egocentric Perspective

Universidad Carlos III de Madrid (UC3M)

04 July 2025 Workshop Talk

Egocentric Vision as a Bridge for Human-AI Collaboration

Eyes Of The Future: Integrating Artificial Intelligence in Smart Eyewear (IAISE) workshop @ IJCNN 2025

21 June 2025 Workshop Talk

Egocentric Vision for Procedural Video Understanding

Egocentric Perception & Action for Robot Learning workshop at RSS 2025, Los Angeles, US