Research Article Volume 6 Issue 5
1Institute of Biomedical Engineering, College of Electrical and Computer Engineering, National Chiao Tung University, Taiwan
2Himax Technologies, Inc., Taiwan
3Department of Computer Science, National Chiao Tung University, Taiwan
Correspondence: Tzu-Chien Hsiao, Institute of Biomedical Engineering, College of Electrical and Computer Engineering, National Chiao Tung University
Received: October 24, 2017 | Published: November 28, 2017
Citation: Wu YM, Lee PM, Hsiao TC. Searching for target audience based on facebook demographics. Biom Biostat Int J. 2017;6(5):427-442. DOI: 10.15406/bbij.2017.06.00179
Advertising is a professional field that has evolved alongside media and technology. With the rise of Internet technology, advertising has moved toward two-way communication. The marketing expert Kotler proposed target marketing theory in 1994. This theory divides consumers into groups according to their characteristics and then serves each group advertising tailored to those characteristics. It has opened a new direction for two-way communication. With the rise of Web 2.0 technology, consumers can also respond to a specific advertisement (and its advertiser) by clicking.
The purpose of this study is to take advantage of the popularity of Facebook and the properties of social network advertising. With target marketing as the framework, machine learning is used to model the two-way relationship between advertising and consumers, based on the marketing mix data provided by Facebook. The study can be seen as an optimization of the selection of groups. Therefore, in addition to decision trees and regression, this study introduces XCS, which can build a knowledge base.
With the approval of the Research Ethics Committee for Human Subject Protection (REC) of National Chiao Tung University, the experiment was conducted on Facebook, both for advertising and for collecting data. First, this study created an advertisement, “Free web design teaching course (贈送網頁設計教學課程)”, and selected 943 marketing mixes (groups of advertising audiences) for advertising. Second, the data reports provided by Facebook were exported, the data were encoded and modeled, and the results were compared. XCS can cover (generalize over) unnecessary features. On the whole, combining the Facebook platform with XCS yields information that can help advertising and may provide new ideas for advertising research.
Background
Advertising: The American Marketing Association (AMA) defines the term advertising as “The placement of announcements and persuasive messages in time or space purchased in any of the mass media by business firms, nonprofit organizations, government agencies, and individuals who seek to inform and/ or persuade members of a particular target market or audience about their products, services, organizations, or ideas”. Before the 1890s, the word “advertising” may only have referred to business promotion. With the arrival of more media and technology, such as printing, print advertising emerged, for example in newspapers. Advertising in this period not only began to establish the image of a business but also helped consumers remember product information. Print advertising also gave rise to advertising as communication. After the turn of the 20th century, factories became more and more automated and began to mass-produce a variety of goods, so businesses needed to open up sales channels for these commodities. At this time the first generation of electronic media, radio, began to spread. Manufacturers began to sell their products to consumers through this new medium, prompting the birth of the modern advertising industry. Modern advertising is no longer purely propaganda; it began to draw on market research to improve advertising efficiency.
After the 1950s, television became an important advertising medium. Advertisers no longer advertised the product itself and began to develop brand images, logos, and dramatic techniques. The ability to display (rather than practicality) became the most important and unique selling point. Stigler (1961), Telser (1964), and Nelson (1974) studied how stories can be used in advertising to reduce the cost of search during this period. After 1971, as products became more homogeneous, and with waves of M&A (mergers and acquisitions) and globalization, commodities could flow around the world more easily. In this period advertising, in addition to needing creativity, paid more attention to sales. Segmentation by lifestyle, which could substantially improve the sales results of advertising, began to be studied. This was the prototype of “Targeting”. At this time, however, advertising was still stuck in the era of one-way transmission of information.
The Internet appeared in 1969, e-mail came into use in 1970, and websites emerged in the early 1990s. Even once websites could carry advertising, however, computer media did not replace other media. Only when Web 2.0 platforms for sharing text, images, and video appeared in 2004 did the era of two-way interaction truly begin. As economic marketization accelerated, the international and domestic market environments changed dramatically. The seller’s market and the shortage economy came to an end, ushering in the buyer’s market and the surplus economy. The formation of the buyer’s market and the surplus economy led to a transfer of power: in negotiations between consumers and businesses, the bargaining position of businesses declined, and corporate sovereignty gave way to consumer (customer) sovereignty. As advertising media progressed, advertising psychology began to challenge the model of the rational psychological operator. Early 20th century advertising was designed to attract an undivided consumer market, so that the cost advantage of Fordist production could generate large profits for the company. This is the main reason why early product advertising had to attract the mass consumer market with broad practical value. In the 1990s, psychologists argued that advertising was not delivered to a “single type” of consumer and rejected the existence of homogeneous advertising audiences. In 1994, the marketing expert Kotler formally incorporated target marketing into marketing concepts (Figure 1).
A new medium has never completely replaced the old media. The Internet, however, has enabled advertisers to significantly reduce the cost of communicating with customers. Advertising no longer focuses on creating the most popular advertisement but on directly using the characteristics of customers, which is true “Targeting”. Therefore, researchers such as Xavier (2003), Iyer (2005), and Jun (2009) have devoted themselves to the development of Targeting, for example by developing big data analysis methods. How far Targeting can go is thus a very interesting subject.
Interactive advertising
Because the network brings two-way communication, advertisers are concerned with the interaction between advertising and customers (Figure 2). The focus of advertising is no longer the universal value of the product; customer-oriented advertising has become the most important part. In 2000, the Journal of Interactive Advertising opened up research on the relationship between interactive advertising and marketing. Interaction on the Internet stems from the vigorous development of Web 2.0, which has made Internet advertising more and more valuable (even replacing traditional advertising). Web 2.0 services such as community platforms make it easier for consumers to leave valuable personal data on the web, including instant click-through behavior (Table 1). When developing an advertising strategy, advertisers can obtain through Web 2.0 platforms what previous media could not provide, such as demographic information, customer service records, personal tastes and preferences, and even instant online behavior (viewing or clicking). With this information, advertisers have a more scientific way to respond to customer needs. In other words, advertisers have begun to use Web 2.0 platforms to observe their customers and help promote new products. With this capability comes the challenge to gather and use evolving data to optimize the delivery of marketing messages.
Web1.0 | Web2.0 |
Table 1 Personal data on the web
Target marketing
In a highly interactive environment, both marketing and advertising care about using consumer information as fully as possible to provide the right service (for example, personalized ads). When a marketer in an interactive environment decides which messages to send to customers, the marketer may send the messages currently thought to be most promising or use poorly understood messages for the purpose of information gathering. Philip Kotler defines target marketing as a process in which the seller identifies market segments, selects one or more of them, and develops products and marketing mixes tailored to each (Figure 3).
Motivation
A review of the developing trends of advertising shows that interactive advertising is valuable in modern advertising. According to target marketing theory, people can be systematically classified and targeted. This study therefore asks whether, in a highly interactive future, everyone could receive the most appropriate advertising content according to their own characteristics. Could new technology help consumers build their characteristics in the online world and then receive the corresponding advertisements? The idea extends from advertisers wanting to make personalized advertising to consumers taking the initiative to help advertisers make more appropriate advertising.
Literature review
Interactive advertising model: Shelly Rodgers and Esther Thorson1 argued that the most basic question in interactive advertising is whether the formation of the ad in the Internet environment is controlled by the advertiser or by the consumer. Before the Internet, advertisers controlled when and how to advertise, and consumers could only choose to accept or ignore the advertising. In the Internet era, control of this process can be transferred to and shared with consumers. Researchers have also confirmed that consumers are in control of the formation of advertising,2 and have even argued that if advertisers cannot advertise from the consumer’s point of view and interact with consumers, advertising technology will reach a bottleneck.3 If each consumer is assumed to connect to the Internet with some purpose or plan, control of the advertisement can be transferred from the advertiser to the consumer, provided that interactive advertising can collect information from the consumer.4 This study therefore assumes that interactive advertising is likely to let customers influence what they see (Figure 4).
Therefore, when advertising, advertisers can obtain what previous media could not provide, such as demographic information, customer service records, personal tastes and preferences, and even instant online behavior (Figure 5). With this customer information, advertisers have a more scientific way to respond to customer needs, and customers can begin to affect their own ads. In interactive media, dynamic measurement is therefore very important, and today’s advertising must be able to obtain information on the customer’s immediate behavior.
Click: In 1996, Deighton5 pointed out that interaction needs to meet two conditions: first, the ability to address a specific kind of person, and second, the ability to remember that person’s response after addressing them. In 1997, the IAB Media Measurement Task Force defined a click as “an interaction with an advertisement”.6 In 2002, the metric Clicks, the total number of clicks in a certain period, was described as telling the advertiser that someone acted on the related advertisement.7
What are the features in interactive advertising: On the question of how to define the effectiveness of advertising, Graepel8 suggested that measures of advertising effectiveness may depend on the click-through rate. With the click-through rate, advertisers can combine features of an advertisement that may lead to a high click-through rate. Graepel summarizes the features of interactive advertisements in three categories: ad features, query features, and context features (Table 2). How to filter or even combine these features effectively to achieve a high click-through rate is a subject of wide interest.
Ad features Query features |
bid phases, text, landing page URL |
Table 2 Features of advertisement
Richardson9 describes a process that explains how to use features of an advertisement to predict click-through rates. First, the advertising audience is segmented by customer characteristics. Once a customer clicks on the advertisement, the customer’s characteristics are collected into a database. Finally, the collected customer characteristics are used to plan the advertising strategy.
Learning algorithm
Modeling: The goal is to predict clicks on the advertisement. This is treated as a classification problem in which the click is predicted from a series of features.
Richardson et al.9 use logistic regression (a linear combination followed by the logistic function) to establish the relationship between the features of an advertisement and click-through rates. Assuming that xi is the value of the i-th feature in the ad and that the click (y) is either 0 or 1, then the logistic model takes the form

$$P(y = 1 \mid x) = \frac{1}{1 + \exp\left(-\sum_{i=1}^{a} w_i x_i\right)}$$

There are a features in the advertisement, and wi represents the weight corresponding to the i-th feature. Features of the advertisement can be demographic information or personal tastes and preferences.
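The logistic model described above can be illustrated with a minimal Python sketch. The feature values, the weights, the optional `bias` term, and the 0.5 decision threshold are illustrative assumptions, not values taken from the study.

```python
import math


def click_probability(features, weights, bias=0.0):
    """Logistic function applied to a linear combination of ad features.

    `bias` is an assumed intercept term; it defaults to 0 so the function
    reduces to the plain weighted sum of features.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def predict_click(features, weights, bias=0.0, threshold=0.5):
    """Return 1 (click) if the estimated probability reaches the threshold."""
    return 1 if click_probability(features, weights, bias) >= threshold else 0


# Hypothetical binary features and weights, for illustration only.
example_features = [1, 0, 1]
example_weights = [0.8, -0.4, 0.3]
example_probability = click_probability(example_features, example_weights)
example_prediction = predict_click(example_features, example_weights)
```

In practice the weights would be fitted to the training data, for example by maximum likelihood; the sketch only shows how a fitted model maps features to a click prediction.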
Because of the network, advertisers can interact with consumers, and researchers have become involved in predicting the effectiveness of advertising, as shown in Table 3.8,10-12 In this complex environment, the challenge is to tap the part of the data that has not yet been explored in order to develop a strategy for targeting customers. In regression, however, the contribution of a particular feature cannot be seen directly. A special clustering method is needed to construct a knowledge base for developing a strategy for targeting customers.
Authors (Year) | Algorithm or method | Handle dis-continuous features |
Graepel T.8 | Probit regression | This paper presented an online learning algorithm used for CTR prediction in Bing's sponsored search advertising. |
Dave K. S.10 | Boosted trees | This paper proposed an approach to predict the CTR for new ads based on the similarity with other ads/queries. |
Trofimov I.11 | Boosted trees | This paper presented a new approach to the CTR prediction task for sponsored search, the MatrixNet machine learning algorithm. |
Cheng H.12 | Logistic regression | This paper proposed using multimedia features to improve the accuracy of click prediction for new ads in an NGD advertising system. |
Table 3 Predicting the effectiveness of advertising
A decision tree has a multi-layered architecture and a special form of clustering for building a knowledge base. It classifies known instances to create a tree structure and summarizes the hidden rules between the category field and the other fields of the instances. It therefore has the opportunity to construct a knowledge base that reveals hidden information (Figure 6).
This study proposes XCS, a method that not only has a multi-level architecture but also has the advantage of generalization. XCS uses a special matching process to build a knowledge base, and its generalization capability enhances the knowledge base and increases its readability (Figure 7).
Facebook advertisement platform (FbAP): An advertiser considering interactive marketing technology and reaching targeted customers through interactive media may advertise on Facebook. Because of the two main components of FbAP (Table 4),13 the information left by customers on the platform is more authentic. Its core value lies in the links between real friends, through which a variety of applications spread all kinds of messages. The connections between friends form a network of relationships. Facebook has become a distribution system whose distribution mechanism is the customer’s personal connections. FbAP also has a mechanism for grouping ads, so advertisers can advertise to a specified marketing mix, which makes the first step of target marketing possible: segmentation.
Component (Authors, year) | Why use Facebook for marketing |
Fans page (Tuten13) | Consumers interact with brands through pages because they are looking for incentives and in order to build a relationship with a company. |
Social graph (Geminder;22 Laudon23) | The social graph is a free word-of-mouth system, which allows users to inform other users about their favorite products and services. |
Table 4 The major components on Facebook advertisement platform (FbAP)
Hypothesis: Assuming that a Facebook member (virtual avatar) can be used to target a person in the physical world, advertising with the Facebook advertisement platform and a learning algorithm lets customers influence the messages they receive by interacting directly with the advertiser (Figure 8).
Objective
Internet media allow each person to leave recordable information on the web. This makes communication between advertiser and consumer possible (Figure 9). The purpose of this study is to take advantage of the current popularity of Facebook and the characteristics of social networking advertising. With target marketing as the base, machine learning is used to model the two-way relationship between advertising and consumers, based on the marketing mix data provided by Facebook. This study can be considered an optimization of the selection of groups.
Advertising for experiment
The study is carried out in several parts (Figure 10). The first process takes the report from FbAP, which records the data from the advertisement set up in this study. The advertisement, “Free web design teaching course (贈送網頁設計教學課程)” (Figure 11), yields 943 marketing mixes, each with click or no-click feedback. The second process extracts the characteristics of these marketing mixes and, according to click or no click, uses logistic regression (a linear combination followed by the logistic function), decision tree learning, and an XCS model to classify whether a marketing mix clicks the advertisement.
System flow chart
In FbAP, because of time and cost considerations, it is difficult to distinguish all marketing mixes, so this study picks out only the marketing mixes that are more likely to click on the advertisement “Free web design teaching course (贈送網頁設計教學課程)”. The focus is on whether these characteristics can identify whether the audience will click on the advertisement. The input data are the marketing mixes of the advertisement, and the output data are click or non-click (Figure 12).
Marketing mix: In FbAP, the advertiser combines gender, age, user property, and occupation into a cluster of advertising audience called a marketing mix. Each marketing mix serves as one input record for the research.
Click or no click: Each marketing mix records whether the group clicked the advertisement.
Classification
C4.5 (decision tree): C4.5 is a machine learning technique that builds a model by determining the most useful attribute for class discrimination.14 It selects the attribute with the highest information gain.
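The expected information needed to classify a group S of data samples with m distinct classes, where si samples belong to class i, is given by the following equation:

$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i, \qquad p_i = \frac{s_i}{|S|} \tag{1}$$

Entropy is a measure of the uncertainty in a probability distribution, which quantifies the value of the information contained in a data set. Equation (2) shows the entropy of attribute A with values a1, a2,…, aj, where sik is the number of samples of class i that take the value ak:

$$E(A) = \sum_{k=1}^{j} \frac{s_{1k} + \cdots + s_{mk}}{|S|} \, I(s_{1k}, \ldots, s_{mk}) \tag{2}$$

The information gain measures the information that is gained by branching on attribute A, as in equation (3):

$$Gain(A) = I(s_1, s_2, \ldots, s_m) - E(A) \tag{3}$$

This quantity measures the information gained by partitioning S on the attribute. The attribute that maximizes the information gain is then selected.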
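As a minimal illustration of the attribute selection described above, the following Python sketch computes the expected information of a label set and the information gain of one attribute. The toy rows and labels are invented for illustration and are not data from the study.

```python
import math
from collections import Counter


def expected_information(labels):
    """Expected information (entropy, in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, attribute_index):
    """Information gained by partitioning the samples on one attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute_index], []).append(label)
    # Weighted entropy of the partitions induced by the attribute.
    entropy_after_split = sum(
        len(part) / total * expected_information(part)
        for part in partitions.values()
    )
    return expected_information(labels) - entropy_after_split


# Toy data: (gender, age group) with click (1) or no click (0).
toy_rows = [("male", "25-27"), ("male", "28-30"),
            ("female", "25-27"), ("female", "28-30")]
toy_labels = [1, 1, 0, 0]

gain_gender = information_gain(toy_rows, toy_labels, 0)
gain_age = information_gain(toy_rows, toy_labels, 1)
best_attribute = max(range(2), key=lambda i: information_gain(toy_rows, toy_labels, i))
```

In this toy set, gender separates the classes perfectly while age group carries no information, so the tree would branch on gender first.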
Extended Classifier System (XCS): Here is a brief description of XCS. XCS is designed for both single-step and multiple-step tasks, but in the current study it is applied only to single-step problems. XCS is a system that continually makes decisions and learns from the environment by receiving rewards and evolving its rule set (called the population set [P]). [P] contains all the classifiers in XCS; each classifier has a condition, an action, and a set of associated parameters. The condition is a string over {0, 1, #}; the action is a string over {0, 1}. Three parameters are associated with a classifier clj: (1) the payoff prediction pj, which estimates the reward the system will receive if the classifier matches and its action is chosen by the system; (2) the prediction error εj, which estimates the error of pj with respect to the actual reward received; and (3) the fitness Fj, which is computed from the prediction error. The system architecture of XCS is shown in Figure 13.15,16
1. Matching operation
2. Action processing
2-1 Fitness-weighted average of the predictions of classifiers advocating ai
2-2 Action selection (largest prediction or roulette-wheel)
2-3 Action execution
2-4 External reward (rn)
3. Reinforcement learning
3-1 Updating [A]n (P → rn)
4. Update operation ([P]n+1)
4-1 Rules in [P]n+1 corresponding to [A]n will be updated
Figure 13 The system architecture of the XCS.
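The matching operation and the fitness-weighted prediction in Figure 13 can be sketched in Python as follows. The `Classifier` class, the example rules, and their parameter values are hypothetical and serve only to illustrate the mechanism; they are not the study's Java implementation.

```python
from dataclasses import dataclass


@dataclass
class Classifier:
    condition: str     # string over {0, 1, #}; '#' matches either bit
    action: int        # 0 (no click) or 1 (click)
    prediction: float  # payoff prediction p
    error: float       # prediction error epsilon
    fitness: float     # fitness F


def matches(condition, state):
    """A condition matches when every non-'#' position equals the input bit."""
    return len(condition) == len(state) and all(
        c == "#" or c == s for c, s in zip(condition, state)
    )


def build_match_set(population, state):
    """[M]: all classifiers in [P] whose condition matches the input."""
    return [cl for cl in population if matches(cl.condition, state)]


def prediction_array(match_set):
    """Fitness-weighted average of the predictions for each advocated action."""
    weighted_sum, fitness_sum = {}, {}
    for cl in match_set:
        weighted_sum[cl.action] = weighted_sum.get(cl.action, 0.0) + cl.prediction * cl.fitness
        fitness_sum[cl.action] = fitness_sum.get(cl.action, 0.0) + cl.fitness
    return {a: weighted_sum[a] / fitness_sum[a] for a in weighted_sum if fitness_sum[a] > 0}


def select_best_action(predictions):
    """Deterministic selection: the action with the largest prediction."""
    return max(predictions, key=predictions.get)


# Hypothetical rules over a 4-bit input, for illustration only.
example_population = [
    Classifier("1#01", 1, 900.0, 10.0, 0.8),
    Classifier("1###", 0, 200.0, 50.0, 0.4),
    Classifier("0###", 1, 1000.0, 0.0, 0.9),
    Classifier("##01", 1, 500.0, 5.0, 0.2),
]
example_match_set = build_match_set(example_population, "1101")
example_predictions = prediction_array(example_match_set)
example_action = select_best_action(example_predictions)
```

The '#' symbol is what gives the rule set its generalization: a single rule such as "1###" covers every input whose first bit is 1.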
• The marketing mix as a subject.
• Each marketing mix will have its corresponding response (click-through rate).
Subject
There are 943 marketing mixes (Figure 14) based on age, gender, occupation, and user property. There is only one output, click or non-click. The inputs are encoded with Gray code, whose advantage is that adjacent codes differ by the minimum amount, a single bit.
The study, Targeting the right audience based on Facebook Advertisement reports, was approved by the Research Ethics Committee for Human Subject Protection (REC), National Chiao Tung University (NCTU-REC-104-031).
Encode: With the coding described above, the input data have a length of 1 + 4 + 4 + 4 = 13 bits (binary). The output is simpler: 1 for click and 0 for no click. For example, a male (1), 46 years old (0101), working in Information Technology and Technology (0001), who uses a Gmail address (0111) and viewed the ad without clicking is encoded by concatenating these field codes, with output 0.
The input values are gender, age, occupation, and user property (Table 5).
Gender category
Class | Gray code | Description |
1 | 0 | Male |
2 | 1 | Female |
Age category
Class | Gray code | Description |
1 | 0 | 25-27 |
2 | 1 | 28-30 |
3 | 11 | 31-34 |
4 | 10 | 35-38 |
5 | 110 | 39-41 |
6 | 111 | 42-44 |
7 | 101 | 45-47 |
8 | 100 | 47-50 |
9 | 1100 | 51-53 |
10 | 1101 | 54-56 |
11 | 1111 | 57-59 |
User property category
Class | Gray code | Description |
1 | 0 | Using Android device |
2 | 1 | Use iOS device |
3 | 11 | Use Gmail address |
4 | 10 | Use Hotmail address |
5 | 110 | Yahoo |
6 | 111 | Early adoption of technology |
7 | 101 | Facebook fan page administrator |
8 | 100 | Photo upload tool |
9 | 1100 | Late technology public |
10 | 1101 | Small business owners |
Occupation category
Class | Gray code | Description |
1 | 0 | Business and Financial Operations 1 |
2 | 1 | Arts / Entertainment / Sports / Media 2 |
3 | 11 | Information Technology and Technology 2 |
4 | 10 | Health and Medicine 2 |
5 | 110 | Law 2 |
6 | 111 | Science 2 |
7 | 101 | Business 3 |
8 | 100 | Transport and Handling 4 |
9 | 1100 | Retail 5 |
10 | 1101 | Agriculture / Forestry / Fisheries 6 |
11 | 1111 | Food preparation 7 |
12 | 1110 | Construction and Engineering 7 |
13 | 1010 | Installation and maintenance 7 |
14 | 1011 | Manufacturing 8 |
15 | 1001 | Temporary and seasonal 9 |
16 | 1000 | Home cleaning and maintenance 9 |
Table 5 Coding of the characteristics of the advertisement
Republic of China occupation standards Classification
Category 1 “Representatives, Supervisors and Managers”
Category 2 “Professionals”
Category 3 “Technicians and Assistant Professionals”
Category 4 “Support Staff”
Category 5 “Service and Sales Staff”
Category 6 “Farmers, Forestry, Fisheries and Livestock Production Workers”
Category 7 “Art-related staff”
Category 8 “Machinery Operator and Assembler”
Category 9 “Soldiers”
Category 0 “Basic Skilled Workers and Laborers”
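The Gray codes in Table 5 follow the standard reflected binary Gray code of the class number minus one. The Python sketch below reproduces them under that assumption, zero-padding gender to 1 bit and the other three fields to 4 bits. The function names are illustrative, and the field order (gender, age, occupation, user property) follows the order in which the text lists the inputs.

```python
def gray_code(class_index, width):
    """Reflected binary Gray code of (class_index - 1), zero-padded to `width` bits."""
    n = class_index - 1
    return format(n ^ (n >> 1), f"0{width}b")


def encode_marketing_mix(gender_class, age_class, occupation_class, user_property_class):
    """Concatenate the field codes into a 1 + 4 + 4 + 4 = 13 bit input string."""
    return (
        gray_code(gender_class, 1)
        + gray_code(age_class, 4)
        + gray_code(occupation_class, 4)
        + gray_code(user_property_class, 4)
    )


def hamming_distance(a, b):
    """Number of bit positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


# Class 7 of the age category (45-47) maps to 0101, as listed in Table 5.
age_code_class_7 = gray_code(7, 4)

# Adjacent classes always differ in exactly one bit.
adjacent_distances = [
    hamming_distance(gray_code(k, 4), gray_code(k + 1, 4)) for k in range(1, 16)
]
```

Because neighboring classes differ in a single bit, a rule that generalizes over one bit position tends to cover neighboring age groups rather than unrelated ones.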
Validation
This study uses 10-fold cross validation. Initially, the original sample is randomly partitioned into ten equal sized subsamples. Of the ten subsamples, a single subsample is retained as the validation data for testing the model, and the remaining nine subsamples are used as training data. The cross-validation process is then repeated ten times. The ten results from the folds can then be averaged to produce a single estimation (Figure 15).
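The fold construction described above can be sketched in Python as follows. The shuffling seed and the round-robin assignment of samples to folds are illustrative choices; when the sample count is not divisible by ten, the fold sizes differ by at most one.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def average_accuracy(fold_accuracies):
    """Combine the per-fold results into a single estimate."""
    return sum(fold_accuracies) / len(fold_accuracies)


# Example with 414 samples (a hypothetical balanced set of 207 + 207).
example_splits = list(k_fold_indices(414))
```

Each sample appears in exactly one test fold, so every observation is used for validation once and for training nine times.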
This study uses Weka 3.6.917 to implement C4.5 and regression modeling, and uses the Java programming language to implement XCS modeling. The steps for implementing XCS are as follows:
1. Matching: if a marketing mix presented by the environment matches a rule in [P], the rule enters [M].
2. Action selection: select the action with the largest prediction, or select the action probabilistically.
3. Payoff function: update p, ɛ, and F using the training data set (reward setting: reward = 1000 for a correct action; 0 otherwise).
4. GA: triggered occasionally to search for accurate classifiers in the solution space, including crossover and mutation (Figure 16).
Test: When a test instance is presented, a number of classifiers in [P] will match it, and the action with the maximum fitness-weighted average of the predictions is selected.
In these advertising data, 207 marketing mixes clicked the advertisement and 736 did not. Because the training data set is imbalanced, models built with C4.5 and regression tend toward the majority class; that is, their ability to predict the minority class is very poor. In order not to lose the information of the marketing mixes that clicked the advertisement, this study uses K-means to down-sample the non-click marketing mixes (Figure 17).
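The payoff update in step 3 can be illustrated with a Widrow-Hoff style sketch in Python. The learning rate `beta = 0.2` is a commonly used XCS default and is an assumption here, as is the ordering (error updated with the prediction held before the update). The fitness update, which depends on the relative accuracy of all classifiers in the action set, is omitted for brevity.

```python
def reward_for(action, label, correct_reward=1000.0):
    """Reward setting from the text: 1000 for a correct action, 0 otherwise."""
    return correct_reward if action == label else 0.0


def update_prediction_and_error(prediction, error, reward, beta=0.2):
    """Move the payoff prediction p and prediction error epsilon toward the reward.

    `beta` is an assumed learning rate, not a value reported in the study.
    """
    new_error = error + beta * (abs(reward - prediction) - error)
    new_prediction = prediction + beta * (reward - prediction)
    return new_prediction, new_error


# One update for a hypothetical classifier that chose the correct action.
example_reward = reward_for(action=1, label=1)
example_p, example_eps = update_prediction_and_error(500.0, 0.0, example_reward)
```

Repeated updates with a consistent reward drive the prediction toward that reward and the error toward zero, which is what marks a classifier as accurate.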
In machine learning, precision and recall are often used to evaluate a model. “Precision” is the ratio of correct answers to all answers given. “Recall” is the ratio of correct answers to the correct answers in the ground truth. For example, suppose there are five black balls and four white balls; four black balls are correctly classified as black and two white balls are correctly classified as white, so the remaining black ball is classified as white and the other two white balls as black. The precision for black balls is then 4 / (4 + 2) = 2/3, and the recall for black balls is 4/5.
The following sections explore whether clicking or not clicking on the advertisement depends on the marketing mix.
Modeling with methods commonly used in the literature
The models built with logistic regression (Figure 18) and C4.5 (Figure 19) did not achieve high precision and recall, indicating that neither method is a good way to identify whether a marketing mix clicks the advertisement. Both methods, however, show higher precision for no-click and higher recall for click.
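The ball example can be checked with a short Python sketch. The label lists below simply restate the example: five black and four white balls, with four black and two white balls classified correctly.

```python
def precision_recall(actual, predicted, positive):
    """Precision and recall for one class, computed from paired label lists."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    false_pos = sum(1 for a, p in pairs if a != positive and p == positive)
    false_neg = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Five black balls (four predicted black), four white balls (two predicted white).
actual_balls = ["black"] * 5 + ["white"] * 4
predicted_balls = ["black"] * 4 + ["white"] + ["white"] * 2 + ["black"] * 2

black_precision, black_recall = precision_recall(actual_balls, predicted_balls, "black")
```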
Modeling with XCS
This study developed an XCS model to predict click or no-click for different marketing mixes and compared it with C4.5 and regression, which are commonly used in the literature. The model was built from the marketing mixes collected from the Facebook platform during the experiment to detect click or no click.
Two evaluation methods are used to judge the accuracy of the model. The data are divided into 10 parts. In the first method, called self-consistency test evaluation, the model is trained and tested with the same data, and each of the 10 sets is run once. In the second method, called 10-fold cross validation, each group in turn serves as the test set while the remaining nine groups are used as the training set for building the model. The first method tests a model's forecasting performance on its own data; the second measures forecasting performance on new data. Table 6 shows the results.
Method | Evaluation | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Average |
LR | Self calibration | 0.788 | 0.75 | 0.726 | 0.716 | 0.734 | 0.727 | 0.727 | 0.759 | 0.753 | 0.727 | 0.666 |
LR | Cross-validation | 0.78 | 0.609 | 0.634 | 0.707 | 0.658 | 0.634 | 0.595 | 0.659 | 0.585 | 0.75 | 0.661 |
C4.5 | Self calibration | 0.737 | 0.686 | 0.678 | 0.659 | 0.68 | 0.686 | 0.689 | 0.686 | 0.691 | 0.675 | 0.687 |
C4.5 | Cross-validation | 0.634 | 0.634 | 0.707 | 0.878 | 0.683 | 0.634 | 0.619 | 0.634 | 0.585 | 0.727 | 0.673 |
XCS | Self calibration | 0.97 | 0.97 | 0.973 | 0.962 | 0.983 | 0.967 | 0.967 | 0.959 | 0.968 | 0.97 | 0.969 |
XCS | Cross-validation | 0.707 | 0.78 | 0.658 | 0.731 | 0.926 | 0.658 | 0.585 | 0.634 | 0.634 | 0.636 | 0.695 |
Table 6 The results of the three methods.
Software: The analysis was performed in SPSS with a paired t-test, which determines whether the mean of the differences between two paired samples differs from 0. A paired t-test can be more powerful than a 2-sample t-test because the latter includes additional variation arising from the independence of the observations.
Model comparison: The results of the self-consistency test evaluation are shown in Table 7. Reading the table from left to right, the difference (XCS-LR) has a mean of 0.30343, a standard deviation of 0.23459, and a standard error of 0.07418. The 95% confidence interval for the difference is 0.13562 to 0.47124, which does not contain 0, indicating that the XCS and LR models are significantly different. The p value gives the same result. The difference (XCS-C4.5) has a mean of 0.2822, a standard deviation of 0.02041, and a standard error of 0.00645. The 95% confidence interval for the difference is 0.2676 to 0.2968, which does not contain 0, indicating that the XCS and C4.5 models are significantly different. The p value gives the same result.
Paired differences | Mean | Std. deviation | Std. error mean | 95% confidence interval of the difference, lower | 95% confidence interval of the difference, upper | t | df | Sig. (2-tailed) |
Pair 1 XCS - LR | 0.30343 | 0.23459 | 0.07418 | 0.13562 | 0.47124 | 4.09 | 9 | 0.003 |
Pair 2 XCS - C4.5 | 0.28220 | 0.02041 | 0.00645 | 0.26760 | 0.29680 | 43.732 | 9 | 0.000 |
Table 7 The results of self-consistency test evaluation.
The results of 10-fold cross validation are shown in Table 8. Reading the table from left to right, the difference (XCS-LR) has a mean of 0.0338, a standard deviation of 0.11196, and a standard error of 0.03541. The 95% confidence interval for the difference is -0.04629 to 0.11389, which contains 0, indicating that the XCS and LR models are not significantly different. The p value gives the same result. The difference (XCS-C4.5) has a mean of 0.0214, a standard deviation of 0.11428, and a standard error of 0.03614. The 95% confidence interval for the difference is -0.06035 to 0.10315, which contains 0, indicating that the XCS and C4.5 models are not significantly different. The p value gives the same result.
| Paired differences | Mean | Std. deviation | Std. error mean | 95% CI lower | 95% CI upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|
| Pair 1 XCS - LR | 0.03380 | 0.11196 | 0.03541 | -0.04629 | 0.11389 | 0.955 | 9 | 0.365 |
| Pair 2 XCS - C4.5 | 0.02140 | 0.11428 | 0.03614 | -0.06035 | 0.10315 | 0.592 | 9 | 0.568 |
Table 8 The results of 10-fold cross validation.
Impact of balanced data sets
In chapter 3.1, this study used K-means to balance the data set. This method reduces the number of marketing mixes that did not click on the advertisement to match the number that clicked. However, it may lose information about the marketing mixes that did not click (the groups produced by K-means may not be representative), and the non-clicking marketing mixes may themselves carry characteristics that help predict click or no click. Therefore, another method of balancing the data set was tried. In our data set, 207 marketing mixes belong to the group that clicked the advertisement and 736 belong to the group that did not. In this chapter, the 207 clicking marketing mixes were increased to 736, that is, up sampling. The results are given in Table 9; Figure 20 and Figure 21 show that up sampling increased the accuracy of the models.
| Method | Evaluation | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LR | Self calibration | 0.737 | 0.748 | 0.749 | 0.746 | 0.731 | 0.733 | 0.73 | 0.745 | 0.75 | 0.74 | 0.741 |
| LR | Cross validation | 0.71 | 0.697 | 0.728 | 0.762 | 0.756 | 0.775 | 0.722 | 0.735 | 0.673 | 0.724 | 0.728 |
| C4.5 | Self calibration | 0.842 | 0.793 | 0.795 | 0.845 | 0.815 | 0.809 | 0.833 | 0.82 | 0.808 | 0.829 | 0.819 |
| C4.5 | Cross validation | 0.792 | 0.754 | 0.747 | 0.769 | 0.772 | 0.728 | 0.769 | 0.762 | 0.705 | 0.782 | 0.758 |
| XCS | Self calibration | 0.962 | 0.965 | 0.951 | 0.953 | 0.955 | 0.954 | 0.957 | 0.957 | 0.962 | 0.954 | 0.957 |
| XCS | Cross validation | 0.85 | 0.81 | 0.816 | 0.857 | 0.83 | 0.816 | 0.857 | 0.844 | 0.837 | 0.805 | 0.832 |
Table 9 The results of the three methods with another balancing data set.
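The up sampling step described above can be illustrated with a short sketch. The text does not state how the 207 clicking marketing mixes were increased to 736, so random duplication with replacement is an assumption here, and the function name `upsample_minority` and the placeholder rows are hypothetical.

```python
import random


def upsample_minority(minority, majority, seed=0):
    """Duplicate randomly chosen minority rows until both classes are the same size.

    Assumption: plain random oversampling with replacement; the original
    study may have used a different up sampling scheme.
    """
    rng = random.Random(seed)
    missing = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(missing)]
    return minority + extra


# Placeholder rows; in the study each row would be an encoded marketing mix.
clicked = [("click", i) for i in range(207)]
not_clicked = [("no_click", i) for i in range(736)]

balanced_clicked = upsample_minority(clicked, not_clicked)
print(len(balanced_clicked), len(not_clicked))
```

Because up sampling only repeats existing rows, every non-clicking marketing mix is kept, which is the property that down sampling gives up.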
Model comparison (after a different balanced data set method): Table 10 shows the paired t-test on the self-consistency results. For the XCS-C4.5 difference, the mean is 0.0742, the standard deviation is 0.02882, and the standard error of the mean is 0.00912. The 95% confidence interval for the difference, 0.05358 to 0.09482, does not contain 0, indicating that XCS and C4.5 differ significantly; the p value leads to the same conclusion. For the XCS-LR difference, the mean is 0.104, the standard deviation is 0.03605, and the standard error of the mean is 0.0114. The 95% confidence interval for the difference, 0.07821 to 0.12979, does not contain 0, indicating that XCS and LR differ significantly; the p value leads to the same conclusion.
| Paired differences | Mean | Std. deviation | Std. error mean | 95% CI lower | 95% CI upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|
| Pair 1 XCS - C4.5 | 0.07420 | 0.02882 | 0.00912 | 0.05358 | 0.09482 | 8.140 | 9 | 0.000 |
| Pair 2 XCS - LR | 0.10400 | 0.03605 | 0.01140 | 0.07821 | 0.12979 | 9.122 | 9 | 0.000 |
Table 10 The results of self-consistency test evaluation.
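A paired comparison of this kind can be recomputed from per-fold accuracies such as those in Table 9. The following is a minimal sketch using only the Python standard library, with the XCS and C4.5 cross-validation rows of Table 9 as input. The function name `paired_t` is hypothetical, the statistical software used in the original analysis is not stated, and the critical value 2.262 (two-sided 95% point of Student's t with 9 degrees of freedom) is hard-coded rather than computed.

```python
import math
import statistics

# Per-fold cross-validation accuracies copied from Table 9.
xcs = [0.850, 0.810, 0.816, 0.857, 0.830, 0.816, 0.857, 0.844, 0.837, 0.805]
c45 = [0.792, 0.754, 0.747, 0.769, 0.772, 0.728, 0.769, 0.762, 0.705, 0.782]

# Two-sided 95% critical value of Student's t with 9 degrees of freedom.
T_CRIT_DF9 = 2.262


def paired_t(a, b, t_crit):
    """Paired-samples t statistic and confidence interval for mean(a - b)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)      # sample standard deviation (n - 1)
    se = sd / math.sqrt(n)            # standard error of the mean difference
    return {
        "mean": mean,
        "sd": sd,
        "se": se,
        "t": mean / se,
        "df": n - 1,
        "ci": (mean - t_crit * se, mean + t_crit * se),
    }


result = paired_t(xcs, c45, T_CRIT_DF9)
# The difference is significant at the 5% level when the interval excludes 0.
significant = not (result["ci"][0] <= 0.0 <= result["ci"][1])
print(result, significant)
```

The decision rule in the last step is the same one used in the text: a 95% confidence interval that does not contain 0 corresponds to a two-tailed p value below 0.05.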
Table 11 shows the paired t-test on the 10-fold cross validation results. For the XCS-C4.5 difference, the mean is 0.1381, the standard deviation is 0.01938, and the standard error of the mean is 0.00613. The 95% confidence interval for the difference, 0.12424 to 0.15196, does not contain 0, indicating that XCS and C4.5 differ significantly; the p value leads to the same conclusion. For the XCS-LR difference, the mean is 0.2161, the standard deviation is 0.0082, and the standard error of the mean is 0.00259. The 95% confidence interval for the difference, 0.21024 to 0.22196, does not contain 0, indicating that XCS and LR differ significantly; the p value leads to the same conclusion.
| Paired differences | Mean | Std. deviation | Std. error mean | 95% CI lower | 95% CI upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|
| Pair 1 XCS - C4.5 | 0.13810 | 0.01938 | 0.00613 | 0.12424 | 0.15196 | 22.539 | 9 | 0.000 |
| Pair 2 XCS - LR | 0.21610 | 0.00820 | 0.00259 | 0.21024 | 0.22196 | 83.355 | 9 | 0.000 |
Table 11 The results of 10-fold cross validation
Impact of data imbalance
In chapter 3, K-means was used to down sample the marketing mixes that did not click the advertisement, reducing 736 marketing mixes to 207. By combining the down-sampled non-clicking marketing mixes with the clicking marketing mixes into one data set, we expected to extract the characteristic distribution (gender, age, occupation, and user property) of the marketing mixes that click the advertisement. The analysis in chapter 3.1 found that this equal-sized data set does not yield high accuracy, precision, or recall. The analysis in chapter 3.2 also shows that, although XCS has high accuracy, it does not differ significantly from LR and C4.5. In chapter 3.3, the 207 clicking marketing mixes were up sampled to 736, and the models built with LR, C4.5, and XCS all reached higher accuracy. This suggests that down sampling the non-clicking marketing mixes loses information, and that the characteristics of the non-clicking marketing mixes may be useful for distinguishing the advertising audience. In chapter 3.1, it can also be observed that, whatever the model, the non-clicking marketing mixes have higher precision.
Interesting rules from XCS
Modeling with XCS has an advantage: after modeling is complete, the population set can be viewed as a rule set. The knowledge base established by modeling shows the characteristic distribution of the conditions (marketing mixes) in the model. Every condition in the trained XCS can therefore be sorted by fitness, and a rule with a higher fitness value has a higher chance of being matched. Because XCS draws on clustering techniques, the rules with the larger probability can be read as characteristics of the data set used in the modeling process (in this application, the characteristic distribution of the marketing mixes).
In this study, the model was trained with XCS ten times to obtain ten different models. Each time, the rules were sorted by fitness and the 10 rules with the highest values were kept. These rules are expected to reveal which characteristics favor clicking and which favor not clicking.
XCS for analysis of age: The model was trained with XCS ten times to obtain ten different models (A~J), and the 10 rules with the highest fitness were taken from each. These 100 rules were compared with the original 943 marketing mixes to observe whether the XCS knowledge base had learned the correct rules. Figure 22 shows the age distribution of the original 943 marketing mixes. The two age groups with the most clicks in that distribution, 45-47 (0101) and 51-53 (1100), account for 28% of the rules trained by XCS, nearly one third (Figure 23).
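The procedure of keeping the fittest rules of each model and measuring how many of them carry a given age code can be sketched as follows. The `Rule` class, the helper names, and the toy populations are hypothetical stand-ins for trained XCS populations; only the age codes 0101 and 1100 come from the text.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    age_code: str   # age bits of the rule condition
    fitness: float  # fitness assigned by XCS


def top_rules(population, k=10):
    """Return the k rules with the highest fitness."""
    return sorted(population, key=lambda r: r.fitness, reverse=True)[:k]


def share_with_age(rules, age_codes):
    """Fraction of rules whose age code is one of the given codes."""
    hits = sum(1 for r in rules if r.age_code in age_codes)
    return hits / len(rules)


# Toy populations standing in for two of the ten trained models.
model_a = [Rule("0101", 0.9), Rule("1100", 0.8), Rule("0001", 0.2), Rule("0010", 0.7)]
model_b = [Rule("0011", 0.95), Rule("0101", 0.6), Rule("1000", 0.4)]

# Pool the top rules of every model (k=2 here; the study uses k=10 and ten models).
pooled = top_rules(model_a, k=2) + top_rules(model_b, k=2)
share = share_with_age(pooled, {"0101", "1100"})
print(share)
```

With ten models and k=10, `pooled` would hold the 100 rules discussed in the text, and `share` would be the fraction that the text reports as 28%.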
XCS for the marketing mix overall analysis: Among the 100 rules trained by XCS, the rules with the highest fitness in two of the models are the same, and these two rules (marketing mixes) both match the codes male (0), 51~53 (11000), information technology and technical fields (0011), Yahoo users (0110), and photo upload tool (0100) (Figure 26). This indicates that these characteristics are easily judged to be characteristics of marketing mixes that click the advertisement. In the original 943 marketing mixes, the rule conditions (1) male (0), 51~53 (11000), information technology and technical fields (0011), Yahoo users (0110) and (2) male (0), 51~53 (11000), information technology and technical fields (0011), photo upload tool (0100) were indeed found to click the advertisement (Figure 27). This means that XCS successfully found these two rules among the 943 marketing mixes.
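How one XCS rule can cover both marketing mixes can be seen from the way rule conditions are matched. XCS conditions are strings over {0, 1, #}, where # is a "don't care" symbol that matches either bit. The sketch below is illustrative only: it concatenates the codes quoted above for gender (0), occupation (0011), and the two user properties (0110 and 0100), and leaves out the age bits. The field order and the function name `matches` are assumptions, not the encoding layout used in the study.

```python
def matches(condition: str, encoded_mix: str) -> bool:
    """Return True if a ternary XCS condition matches a binary-encoded input.

    '#' in the condition matches either bit; '0' and '1' must match exactly.
    """
    if len(condition) != len(encoded_mix):
        raise ValueError("condition and input must have the same length")
    return all(c == "#" or c == b for c, b in zip(condition, encoded_mix))


# Assumed layout: gender (1 bit) + occupation (4 bits) + user property (4 bits).
male_it_yahoo = "0" + "0011" + "0110"
male_it_photo = "0" + "0011" + "0100"
female_it_yahoo = "1" + "0011" + "0110"

# A generalized condition: the third user-property bit is a don't care.
condition = "0" + "0011" + "01#0"

print(matches(condition, male_it_yahoo))    # True
print(matches(condition, male_it_photo))    # True
print(matches(condition, female_it_yahoo))  # False
```

A condition with more # symbols covers more marketing mixes, which is how XCS can ignore features that do not help the prediction.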
XCS model for the purpose of not clicking the advertisement
In the previous experiments, clickers were coded as 1 and non-clickers as 0, so the payoff of non-clickers is 0. If the encoding is reversed, with clickers coded as 0 and non-clickers as 1, the payoff of clickers becomes 0. When XCS is trained for the purpose of clicking, the non-clicked marketing mixes, whose payoff is 0, are likely to be overly ignored. A model trained for the purpose of not clicking may therefore perform differently: it highlights more of the characteristics that lead to no click, which advertisers should avoid. Used together with the model trained for the purpose of clicking, it will be effective for advertising. The model trained for the purpose of clicking is called model I, and the model trained for the purpose of not clicking is called model II. The results show that model II needs fewer iterations to converge than model I (Figure 28), indicating that training for the purpose of not clicking converges faster, but there is no significant difference in accuracy (Figure 29).
Dilemma
A limitation of this study is that the information collected through the Facebook platform is not fully dynamic data; only the click feedback of each marketing mix is used as the basis for advertising. Collecting more information would require paying more to use a different platform, which is harder to achieve. On the other hand, Facebook profiles are more authentic than the pseudonymous accounts on other platforms, which brings the data closer to our research purpose.
Using XCS, this study built a knowledge base with high accuracy, and its high-efficiency attributes were indeed extracted. A marketing mix with certain characteristics has a higher chance of clicking the advertisement. The method used to balance the data set affects whether XCS differs significantly from the other methods. We presume that down sampling reduces the information and the update diversity of the XCS database. Because the no-click knowledge base built with XCS has the same accuracy, it can be used as an additional advertising strategy to help reduce the probability of reaching a non-target audience.
This paper introduces market targeting for the first time to analyze Facebook advertising audience data and applies classification methods to optimize advertising. It reviews the history of advertising and reasonably speculates that modern advertising will tend toward an interactive mode. The large amount of interactive data on the Facebook platform, combined with machine learning classification methods, made this experiment possible. Combining the Facebook platform with XCS yields some information that helps advertising and may provide new ideas for advertising research.
This study focuses on establishing the knowledge base, but it did not use the knowledge base built by XCS to run a subsequent round of advertising. It only tested the relationship between the model and the marketing mix and explored whether the Facebook platform with XCS yields information that helps advertising. For unknown marketing mixes, there was no opportunity to verify the accuracy. This may be a greater disadvantage in interactive environments where dynamic data matter more. Given time and funding in the future, the model could be used to advertise again and observe its click-through rate.
Manual adjustment of the marketing mixes is labor intensive. Because XCS can screen characteristics (whether locked or removed), it has the potential to automate the adjustment of the marketing mixes.
Interactive advertising current trends
Since Rodgers and Thorson proposed the IAM (interactive advertising model) in 2000, it has been cited 385 times, of which 243 citations (63.1%) are journal articles. In recent years, with the rise of mobile phone media, new advertising formats have tended toward social media advertising and mobile advertising, among which Facebook, Twitter, and YouTube are the most discussed.18 Several types of advertising research follow the current trend. For example, mobile games now play advertisements, and Siemens and Smith19 discussed what happens if users can actively control the advertisement. They also put forward concepts such as control and attention in mobile game advertising and processing fluency. In dynamic advertising, “TrueView in-stream ads” on video platforms such as YouTube are a point of discussion. Kononova20 proposed that TrueView in-stream ads are more persuasive.
If game advertisements can distinguish the properties of consumers in the game, and TrueView in-stream ads on YouTube can analyze the properties of the video being viewed, market targeting will give advertisers a chance to achieve a greater effect.
©2017 Wu, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.