Automatic Patent Clustering using SOM and Bibliographic Coupling

Authors

  • Magali Rezende Gouvêa Meireles Pontifícia Universidade Católica de Minas Gerais http://orcid.org/0000-0001-6928-7132
  • Juan R. S. Carvalho Pontifical Catholic University of Minas Gerais
  • Zenilton K. G. do Patrocínio Júnior Pontifical Catholic University of Minas Gerais
  • Paulo E. M. Almeida Federal Center for Technological Education of Minas Gerais

Abstract

Patents are usually organized into classes defined by the offices responsible for patent protection, in order to give the information a format that is useful for retrieval. The complexity of patent taxonomies is a challenge for the automation of patent classification. In addition, the large number of subgroups makes classification at deeper levels more difficult. This work proposes a method to cluster patents using Self-Organizing Map (SOM) networks and bibliographic coupling. To validate the proposed method, an empirical experiment was run on a patent database drawn from a specific classification system. The results show that patent clusters were successfully identified by the SOM through their cited references, and that the SOM results were similar to those of the k-means algorithm on this task. This study can contribute to the development of knowledge organization systems by evaluating the use of citation analysis for the automatic clustering of patents in a constrained knowledge domain, at the subgroup level of current patent classification systems.
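The idea described in the abstract, clustering patents through their cited references, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `citations` data, the binary encoding of citation vectors, the one-dimensional two-unit map, and all parameter values (`n_units`, `epochs`, `lr0`, `radius0`, `seed`) are assumptions chosen only for illustration.

```python
import math
import random

# Hypothetical toy data: patent id -> set of cited reference ids.
citations = {
    "P1": {"R1", "R2", "R3"},
    "P2": {"R1", "R2"},
    "P3": {"R7", "R8"},
    "P4": {"R7", "R8", "R9"},
}

def coupling_strength(a, b):
    """Bibliographic coupling: number of references cited by both patents."""
    return len(citations[a] & citations[b])

def citation_vectors(cites):
    """Encode each patent as a binary vector over the sorted reference vocabulary."""
    refs = sorted(set().union(*cites.values()))
    return {p: [1.0 if r in cited else 0.0 for r in refs]
            for p, cited in cites.items()}

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def best_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], x))

def train_som(data, n_units=2, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a small 1-D SOM; all parameter values here are illustrative."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs           # linear decay schedule
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-3)    # keep the radius positive
        for x in rng.sample(data, len(data)): # shuffled pass over the data
            bmu = best_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighborhood around the best-matching unit.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

vectors = citation_vectors(citations)
som = train_som(list(vectors.values()))
# Each patent is assigned to the unit that best matches its citation vector.
clusters = {p: best_unit(som, v) for p, v in vectors.items()}
```

In this sketch, patents that share many cited references have similar citation vectors and therefore tend to land on the same map unit, which is the intuition behind combining bibliographic coupling with a SOM.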

Author Biography

Magali Rezende Gouvêa Meireles, Pontifícia Universidade Católica de Minas Gerais

She holds a PhD in Information Science from UFMG (2012), a Master's degree in Technology from CEFET-MG (1998), a Specialization in Process Control and Electronic Instrumentation from UDESC (1991), and a Bachelor's degree in Electrical Engineering from UFMG (1986). She is an Adjunct Professor IV at the Institute of Exact Sciences and Informatics of PUC Minas, where she teaches in the Information Systems and Computer Engineering programs. She is currently a collaborating professor in the Graduate Program in Informatics, a member of the Didactic Coordination Board of the Computer Engineering program, and Editor-in-Chief of the journal Abakós. Her research interests include Categorization Processes, Information Systems, and Applied Computational Intelligence. She completed a postdoctoral fellowship at the Science and Engineering Faculty of Queensland University of Technology in Brisbane, Australia, as a CAPES fellow (2013-2014), where she continues to work as a collaborating researcher.

Published

2017-03-12

How to Cite

Meireles, M. R. G., Carvalho, J. R. S., do Patrocínio Júnior, Z. K. G., & Almeida, P. E. M. (2017). Automatic Patent Clustering using SOM and Bibliographic Coupling. ISys - Brazilian Journal of Information Systems, 10(1), 06–18. Retrieved from https://seer.unirio.br/isys/article/view/5514

Section

Regular Articles