Data advance preparation factors affecting results of sequence rule analysis in web log mining

Munk, Michal; Kapusta, Jozef; Švec, Peter; Turčáni, Milan

Full metadata record

DC field	Value	Language
dc.contributor.author	Munk, Michal
dc.contributor.author	Kapusta, Jozef
dc.contributor.author	Švec, Peter
dc.contributor.author	Turčáni, Milan
dc.date.accessioned	2016-01-14T09:34:32Z
dc.date.available	2016-01-14T09:34:32Z
dc.date.issued	2010
dc.identifier.citation	E+M. Ekonomie a Management = Economics and Management. 2010, no. 4, pp. 143-160.	cs
dc.identifier.issn	1212-3609 (Print)
dc.identifier.issn	2336-5604 (Online)
dc.identifier.uri	http://www.ekonomie-management.cz/download/1331826744_e1b0/12_munk.pdf
dc.identifier.uri	http://hdl.handle.net/11025/17373
dc.format	18 pp.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	Technická univerzita v Liberci	cs
dc.relation.ispartofseries	E+M. Ekonomie a Management = Economics and Management	cs
dc.rights	© Technická univerzita v Liberci	cs
dc.rights	CC BY-NC 4.0	cs
dc.subject	web log mining	cs
dc.subject	příprava dat	cs
dc.subject	hodnocení kvality dat	cs
dc.subject	analýza sekvenčního pravidla	cs
dc.subject	vzory	cs
dc.title	Data advance preparation factors affecting results of sequence rule analysis in web log mining	en
dc.type	článek	cs
dc.type	article	en
dc.rights.access	openAccess	en
dc.type.version	publishedVersion	en
dc.description.abstract-translated	One of the main tasks of web log mining is discovering the behaviour patterns of portal visitors. Based on the discovered patterns of user behaviour, which are represented by sequence rules, it is possible to modify and improve an organisation's web page. This article aims to determine, by means of an experiment, to what degree data preparation is necessary for web log mining, and to specify the steps that are indispensable for obtaining valid data from the log file. The results of the experiment are very important for a portal that is regularly analysed and modified, since they can confirm the correctness of individual analysis steps or, by identifying “useless” steps, simplify the advance preparation of data. The results show that cleaning the data of crawler accesses has a significant impact on the quantity of extracted rules only when the path completion method is used. By contrast, an impact on reducing the portion of inexplicable rules, or on the quality of extracted rules in terms of their basic characteristics, was not proved. Path completion proved crucial in data preparation for web log mining: it has a significant impact on both the quantity and the quality of extracted rules. However, taking the browser used into account when identifying sessions has no significant impact on either the quantity or the quality of extracted rules. There are a number of models for identifying user sessions, which are crucial in data preparation; however, there is also a method that identifies them explicitly. Our next goal is to add this functionality to the existing system and to analyse various parameters of the individual session identification methods in comparison with the reference direct identification. The article also mentions the need to pay attention to the analysis of web logs in real time, to reduce the time needed for the advance preparation of these logs, and at the same time to increase the accuracy of these data depending on the time of their collection.	en
dc.subject.translated	web log mining	en
dc.subject.translated	data preparation	en
dc.subject.translated	data quality assessment	en
dc.subject.translated	sequence rule analysis	en
dc.subject.translated	patterns	en
dc.type.status	Peer-reviewed	en
Appears in collections:	Issue 4 (2010)

Files in this item:

File	Description	Size	Format
12_munk.pdf	Full text	472.41 kB	Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/17373

All records in DSpace are protected by copyright, all rights reserved.
