STRUCTURED DATA EXTRACTION FROM WEB PAGES USING A MOBILE AGENT BASED ON THE BDI ARCHITECTURE

Authors

  • Ahmad Naswin Universitas Megarezky Makassar
  • Sulkifli Universitas Megarezky Makassar
  • Sitti Mawaddah Umar Universitas Megarezky Makassar

DOI:

https://doi.org/10.54314/zsd3hv83

Keywords:

web data extraction; mobile agent; BDI architecture; Simple Tree Matching; JADE

Abstract

The growth of semi-structured and unstructured data on web pages has increased the need for efficient, autonomous, and structure-aware methods for extracting structured data. This study proposes a mobile agent approach based on the BDI (Belief-Desire-Intention) architecture as an alternative to conventional wrappers for structured web data extraction. The system was developed on the JADE (Java Agent Development Framework) platform using the Prometheus methodology and involves two main agents: AgenDownload, which handles migration and retrieval of web page content, and AgenEkstraksi, which implements the Simple Tree Matching and MDR (Mining Data Regions) algorithms to identify and extract flat data records. Experiments were conducted on 40 web pages from Indonesian and English online bookstores. The results show an average recall of 73%, indicating that the system locates most target records, while the average precision of 15.59% indicates that post-extraction filtering needs further refinement. The mobile agent approach remains relevant in distributed network scenarios with bandwidth constraints, near-source processing requirements, and autonomous operation needs. This study contributes a multi-role agent system architecture that can be adapted for automated web data extraction.
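As an illustration of the Simple Tree Matching step named in the abstract, the following is a minimal Python sketch of the textbook formulation of the algorithm; it is not the authors' JADE implementation. The Node class, the helper names, and the two sample records are assumptions introduced only for this example. The normalized score divides the match count by the average size of the two trees, which is a common way to compare candidate data records.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for an HTML DOM node: a tag plus ordered children."""
    tag: str
    children: List["Node"] = field(default_factory=list)


def size(node: Node) -> int:
    """Count the nodes in the tree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children)


def simple_tree_matching(a: Node, b: Node) -> int:
    """Return the size of the maximum order-preserving match between two trees."""
    # Trees whose roots differ do not match at all.
    if a.tag != b.tag:
        return 0
    m, n = len(a.children), len(b.children)
    # best[i][j] = best match between the first i children of a
    # and the first j children of b (dynamic programming table).
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best[i][j] = max(
                best[i - 1][j],
                best[i][j - 1],
                best[i - 1][j - 1]
                + simple_tree_matching(a.children[i - 1], b.children[j - 1]),
            )
    # Add one for the matched roots.
    return best[m][n] + 1


def normalized_similarity(a: Node, b: Node) -> float:
    """Match count divided by the average number of nodes in the two trees."""
    return 2 * simple_tree_matching(a, b) / (size(a) + size(b))


# Two made-up book records with slightly different markup.
record_a = Node("li", [Node("img"), Node("a"), Node("span")])
record_b = Node("li", [Node("a"), Node("span"), Node("span")])

score = simple_tree_matching(record_a, record_b)
similarity = normalized_similarity(record_a, record_b)
print(score, similarity)
```

In an MDR-style pipeline, adjacent subtrees whose normalized similarity exceeds a chosen threshold would be grouped as candidate data records; the threshold value is a tuning choice that the abstract does not specify.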

 




Published

2026-04-27

Section

Article

How to Cite

EKSTRAKSI DATA TERSTRUKTUR PADA HALAMAN WEB MENGGUNAKAN MOBILE AGENT BERBASIS ARSITEKTUR BDI. (2026). JOURNAL OF SCIENCE AND SOCIAL RESEARCH, 9(2), 1631-1638. https://doi.org/10.54314/zsd3hv83
