Hi, I am Varish!

I am a Knowledge Discovery Researcher in AI and Machine Learning Systems at the GE Global Research Center (GRC) in Niskayuna, NY. My work at GE GRC focuses on using Semantic Web technologies and Natural Language Processing to help incorporate Artificial Intelligence and analytics into tools for GE's businesses (such as Healthcare, Power and Digital) and their customers, as well as into initiatives such as the Digital Twin.

I received my Ph.D. (2015) and M.S. (2010) in Computer Science from the University of Maryland, Baltimore County (UMBC). At UMBC, I was a member of the Ebiquity Research Lab, where I worked under the guidance of my guru, Prof. Tim Finin. I also closely collaborated with Prof. Anupam Joshi. I completed my B.E. in Computer Engineering from the University of Mumbai in 2007.

My research interests include the Semantic Web, Linked Data, Information Extraction/Text Analysis and Machine Learning. My projects have broadly focused on extracting information and adding semantics to unstructured or semi-structured data. My Ph.D. dissertation research focused on developing TABEL -- a domain-independent and extensible framework for inferring the semantics of tables found on the web and in medical papers. I developed novel techniques to map column headers to classes, table cell values to entities and pairs of columns to relations from a given ontology and knowledge graph. I also developed a novel Semantic Message Passing scheme, which incorporates semantics into message passing to perform joint inference over a probabilistic graphical model of a table.

Research

My research interests include the Semantic Web, Linked Data, Information Extraction/Text Analysis and Machine Learning. My research projects to date have broadly focused on extracting information and adding semantics to unstructured or semi-structured data.

1. From tables to Linked Data: Tables are an integral part of documents, reports and Web pages, compactly encoding important information that can be difficult to express in text. A number of domains, such as the Web, healthcare and public policy, use tables to encode important information. Yet, we do not have systems that can understand the information encoded in tables. My PhD dissertation research focused on developing a domain-independent and extensible framework for inferring the semantics of tables and representing them as RDF Linked Data. Read more about this project here.

2. Extracting security vulnerabilities from web text: Information about security vulnerabilities and attacks is often found on the web in vulnerability databases such as NIST NVD and IBM X-Force, in security blogs and even in hacker forums and chat rooms. This work focused on processing unstructured free text found on the web and detecting concepts that describe security vulnerabilities. I developed a concept extraction algorithm that used Wikitology and a taxonomy extracted from Wikipedia to detect security vulnerability concepts. The related publication can be found here.

3. Information Extraction from Windows Phone store app descriptions: My research project during my internship at Microsoft Research (MSR) broadly focused on information extraction from descriptions associated with apps in the Windows Phone store. A US patent application has been filed based on this work. My mentor at MSR was Dr. Evelyne Viegas.

4. Entity Disambiguation and Search: My internship project at Microsoft Bing focused on building a proof of concept for disambiguating entities in ambiguous search queries and in search results, and for re-organizing the results based on the identified entities. Two disclosures were filed with Microsoft for patent consideration.

Publications

2017

Paul Cuddihy, Justin McHugh, Jenny Weisenberg Williams, Varish Mulwad, Kareem S. Aggour, "SemTK: An Ontology-first, Open Source Semantic Toolkit for Managing and Querying Knowledge Graphs", arXiv preprint arXiv:1710.11531, 2017. Download

2016

Sudip Mittal, Prajit Kumar Das, Varish Mulwad, Anupam Joshi, and Tim Finin, "CyberTwitter: Using Twitter to generate alerts for cybersecurity threats and vulnerabilities", In 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, CA, USA, 2016. Download

Piyush Nimbalkar, Varish Mulwad, Nikhil Puranik, Anupam Joshi, and Tim Finin, "Semantic Interpretation of Structured Log Files", In 17th IEEE International Conference on Information Reuse and Integration (IRI), Pittsburgh, PA, USA, 2016. Download

Luis Tari, Varish Mulwad, Anna von Reden, "Interactive online learning for clinical entity recognition", In Proceedings of the Workshop on Human-In-the-Loop Data Analytics (HILDA '16), held at SIGMOD, San Francisco, California, 2016. Download

2015

Varish Mulwad, "TABEL - A Domain Independent and Extensible Framework for Inferring the Semantics of Tables", PhD Thesis, University of Maryland, Baltimore County, USA, 2015. Download

2014

Roberto Yus, Varish Mulwad, Tim Finin, and Eduardo Mena, "Infoboxer: Using Statistical and Semantic Knowledge to Help Create Wikipedia Infoboxes", In 13th Int. Semantic Web Conf. (ISWC 2014), Riva del Garda, Italy, 2014. Download

Varish Mulwad, Tim Finin and Anupam Joshi, "Interpreting Medical Tables as Linked Data to Generate Meta-Analysis Reports", 15th IEEE Int. Conf. on Information Reuse and Integration, August 2014. Download

2013

Varish Mulwad, Tim Finin and Anupam Joshi, "Semantic Message Passing for Generating Linked Data from Tables", In Proceedings of the 12th International Semantic Web Conference, Sydney, Australia, October 2013. Download

2012

Varish Mulwad, Tim Finin and Anupam Joshi, "A Domain Independent Framework for Extracting Linked Semantic Data from Tables", In Search Computing - Broadening Web Search, Stefano Ceri and Marco Brambilla (eds.), LNCS volume 7538, Springer, July 2012. Download

2011

Varish Mulwad, Tim Finin and Anupam Joshi, "Automatically Generating Government Linked Data from Tables", In working notes of the AAAI Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges, November 2011. Download

Varish Mulwad, "DC Proposal: Graphical Models and Probabilistic Reasoning for Generating Linked Data from Tables", Proceedings of L. Aroyo et al. (Eds.): ISWC 2011, Part II, LNCS 7032, pp. 317-324, Springer-Verlag Berlin Heidelberg 2011 [at Doctoral Consortium at the International Semantic Web Conference (ISWC) 2011; also accepted as a poster presentation at ISWC 2011] Download

Varish Mulwad, Tim Finin and Anupam Joshi, "Generating Linked Data by Inferring the Semantics of Tables", Proceedings of the First International Workshop on Searching and Integrating New Web Data Sources, Co-located with VLDB 2011, Seattle, September 2011. Download

Varish Mulwad, Wenjia Li, Anupam Joshi, Tim Finin and Krishnamurthy Viswanathan, "Extracting Information about Security Vulnerabilities from Web Text", Proceedings of the Web Intelligence for Information Security Workshop, in conjunction with WI:IAT 2011, 22-27 August, 2011, Lyon, France. Download

2010

Varish Mulwad, Tim Finin, Zareen Syed and Anupam Joshi, "Using linked data to interpret tables", Proceedings of the First International Workshop on Consuming Linked Data, held in conjunction with the Ninth International Semantic Web Conference, Shanghai, November 2010. Download

Varish Mulwad, Tim Finin, Zareen Syed and Anupam Joshi, "T2LD: Interpreting and Representing Tables as Linked Data" (poster paper), Proceedings of the Poster and Demonstration Session at the 9th International Semantic Web Conference, CEUR Workshop Proceedings, Shanghai, November 2010. Download

Varish Mulwad, "T2LD - An automatic framework for extracting, interpreting and representing tables as Linked Data", Masters Thesis, University of Maryland, Baltimore County, USA, August 2010. Download

Zareen Syed, Tim Finin, Varish Mulwad, and Anupam Joshi, "Exploiting a Web of Semantic Data for Interpreting Tables", In proceedings of the Second Web Science Conference, Raleigh NC, 26-27 April 2010. Download

Proposals

Significant contributions to the National Science Foundation (NSF) proposal "EAGER: T2K: From Tables to Knowledge"; awarded ($200,000); PI: Dr. Anupam Joshi; Co-PI: Dr. Tim Finin.

Significant contributions to FFRDC Seed Grant proposal, "Supporting Situation-Aware Systems for Automated Information Sharing and Incident Response"; awarded ($50,000); PI: Dr. Zareen Syed (UMBC)

Patents

Evelyne Viegas, Varish Mulwad, Patrick Pantel, "Action Broker", United States Patent, 2017. link

Professional Academic Activities


Co-organized the Industrial Knowledge Graphs Workshop at the 2017 ACM Web Science Conf. link

President, UMBC ACM student chapter (2012 - 2013)

Program Committee:
-- Workshop on Knowledge Base Construction, Reasoning and Mining (2018)
-- In-Use track for Int. Semantic Web Conf. (2017)
-- Poster & Demo track for Int. Semantic Web Conf. (2016, 2017)
-- Poster & Demo track for Extended Semantic Web Conf. (2017)
-- Int. Workshop on Linked Data for Information Extraction (2014, 2015, 2016, 2017).
-- 28th AAAI Conf. on Artificial Intelligence (AAAI 2014).
-- 2nd Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2012).
-- Int. Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data (Know@LOD 2012 at ESWC 2012).

Reviewer:
-- External reviewer, ACM conference for Human-Computer Interaction (CHI 2016).
-- Sub-reviewer, 30th AAAI Conf. on Artificial Intelligence (AAAI 2016).
-- IEEE Transactions on Knowledge and Data Engineering (TKDE) (2013)
-- VLDB journal's special issue on Structured, Social and Crowd-sourced Data (2012)
-- IEEE Intelligent Systems special issue on Linked Open Government Data (2011)

Honors and Awards


-- Strategic Initiative Award, GE Global Research (2015)
-- Above & Beyond Bronze Award, GE Global Research (2015)
-- NSF awards for travel to the 13th (2014), 12th (2013), 10th (2011) and 9th (2010) Int. Semantic Web Conf.
-- Best PhD research award, UMBC CSEE department's annual research review meet (2013).
-- First place (2012) and Third place (2011), Poster presentation competition, UMBC CSEE department's annual research review.
-- Outstanding Oral Presentation award, 33rd (2011), 32nd (2010) UMBC Graduate Research Conf.
-- Spot award at Ness Technologies for a successful client demo of a test automation prototype (2008).
-- Felicitated by Mastek CMD for the successful implementation of The Interview Scheduler Project (2007).
-- Selected as Best Student in the Computer Engineering Batch at the Atharva College of Engineering, University of Mumbai (2006-2007).
-- Second in School at the Secondary School Exams (2001).