Author: Geonwoo Kim
Mentor: Dr. Charalampos Tsourakakis
Crean Lutheran High School
The advancement of the internet gave rise to a new form of online interaction supported by social network sites (SNS). Over the past years, these sites have grown tremendously, with over one billion users on Facebook and hundreds of millions on Pinterest, Twitter, and Google+. Sharing information on social network sites has changed how people communicate. Through these sites, an individual can create a public profile, connect with other users with whom they share a connection, and share messages, videos, and images. This has led to new and often unpredictable patterns of sharing, propagation, interaction, and content creation among users. As a result, one of the most important and active research areas has been understanding information diffusion on social network sites. Information diffusion refers to how information spreads among interconnected entities, or nodes, in a network. Studying information diffusion helps determine the factors that affect the whole process. In addition, studying data-to-data transmission across social network sites can help in sectors such as marketing. The data transmitted on social network sites usually comprises the source content, that is, the images, videos, or text posted, along with geolocation, posting time, and other metadata.
According to Dasgupta and Gupta (1), social networks are created by the billions of users who carry out various activities, including creating profiles, linking, following, posting content, commenting, and other online interactions. Social media is commonly modeled as an evolving graph in which nodes represent entities, content, themes, and other metadata. A typical social media graph has both edge and node properties. Dasgupta and Gupta (2) note that, for years, a growing body of research has sought subgraphs that let analysts determine how two nodes are connected. The sparse nature of social media graphs still allows many connections to emerge between two nodes (Faloutsos, McCurley, and Tomkins 1). Unlike in other graph analyses, representing the relationship between two nodes with a single path is limiting when analyzing social media subgraphs. It is thus essential to determine the connection subgraph between two nodes as quickly as possible, since it helps identify the few most likely transmission paths of a disease, joke, information leak, or rumor from one user to another (Faloutsos, McCurley, and Tomkins 1). More importantly, it makes it easier to uncover unexpected affiliations between individuals or other entities. Using graphs thus helps summarize the connection between two SNS users and provides a fast means of determining how data is transmitted between them.
Given the importance of graphs in studying connections between SNS users, graph analysis allows one to map out all the edges and identify the social position of every user. Yazdi et al. (141) note that the best strategy for analyzing data-to-data transmission across social network sites is graph theory. The diffusion patterns of information across SNS and its distribution have become a key study area. However, Yazdi et al. (142) argue that one of the most challenging problems has been finding the best and fastest strategy for studying data-to-data transmission across SNS and predicting diffusion paths from actual data, a task with many applications in critical areas such as gossip news, blog postings, virus resource detection, and e-commerce. The popularity of a given news item plays a vital role in determining which nodes will be influenced in the future, so the nodes that were influenced in the past are used to outline how the news is transmitted. In this case, the future nodes are predicted as a function of time (Yazdi et al. 142). The Louvain community detection algorithm has been widely used in data-to-data transmission research. However, it offers no control over the centers of clusters or their number, which limits its effectiveness. The centers of clusters matter because they help in information propagation.
The densest subgraph algorithm, however, can overcome the inefficiencies of the Louvain community detection algorithm, since it provides more effective control over the centers of clusters. Epasto, Lattanzi, and Sozio (1) note that data analysis tasks such as distance query indexing, event detection, community detection, and computational biology have benefited from methods for finding densest subgraphs. Groups of users across SNS have been compared to actual communities, given that most share similar interests or are affiliated with the same company, university, or other organization. In most cases, a burst of words referring to places, cities, company names, or even persons in tweets and posts can indicate that a related event is about to take place. Densest subgraph algorithms have therefore been used to study data-to-data transmission across SNS. They allow analysts to build a compact representation of node distances in a graph, which in turn allows them to compute the distance and time between two nodes and to determine data transmission rates.
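For reference, the node distance that such an index is meant to answer quickly can also be computed directly with a breadth-first search. The sketch below is only that plain baseline, not the densest-subgraph-based indexing described by Epasto, Lattanzi, and Sozio; the function name `hop_distance` and the toy edge list are hypothetical and chosen for illustration.

```python
from collections import defaultdict, deque


def hop_distance(edges, start, goal):
    """Return the fewest hops between start and goal in an undirected graph.

    edges is an iterable of (u, v) pairs. Returns None if goal is unreachable.
    """
    # Build an adjacency map from the edge list.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    if start == goal:
        return 0

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj[node]:
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical toy network of four users.
friendships = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
print(hop_distance(friendships, "a", "d"))
print(hop_distance(friendships, "a", "z"))
```

Because a search like this visits a large part of the graph for every query, it has to be repeated whenever the graph changes, which is the cost a compact distance index tries to avoid.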
New people continually join SNS while others leave, and new friendships form while others end. Moreover, new tweets and posts on SNS such as Facebook and Twitter make older ones less interesting. As a result, the communities of SNS users evolve over time, and new events trigger the formation of new densest subgraphs. Node distances therefore change continually, calling for frequent re-indexing. This can significantly hinder research into data-to-data transmission across SNS, and it requires algorithms that can keep up with ever-evolving users and large, highly dynamic input streams.
Moreover, graphs have been used to model not only social media but also biological and financial networks. A problem common to these domains is finding the group of nodes with the greatest number of connections among them, so a good solution to it is needed. Because social media networks are organized around communities, detecting data-to-data transmission between users calls for a mathematical task known as finding the densest subgraph. In most cases, the density of a k-node subgraph equals its number of edges divided by the maximum possible number of edges, which is k(k-1)/2 for an undirected graph. This indicates that by finding dense subgraphs, one can determine the data-to-data transmission between various communities and even narrow it down to metadata such as location and time. Tsourakakis (1) argues that various data mining techniques have been employed to determine data-to-data transmission across SNS. Many subgraph techniques try to identify near-cliques, which leads to NP-hard variants of the densest subgraph problem. However, a large body of research has pursued solutions to these NP-hard formulations and has shown them to be solvable in practice, making the approach more effective than previous graph mining applications (Tsourakakis 1).
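The density ratio described above can be made concrete with a short sketch. It assumes a simple undirected graph given as a list of edge pairs; the function name `subgraph_density` and the sample users are hypothetical and serve only as illustration.

```python
def subgraph_density(edges, nodes):
    """Edges present among `nodes` divided by the maximum possible, k(k-1)/2."""
    nodes = set(nodes)
    k = len(nodes)
    if k < 2:
        return 0.0
    # Treat each edge as an unordered pair; ignore self-loops and duplicates.
    present = {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in nodes and v in nodes
    }
    return len(present) / (k * (k - 1) / 2)


# Hypothetical follower graph: three mutually connected users plus a chain.
edges = [("ana", "ben"), ("ana", "cy"), ("ben", "cy"), ("cy", "dee"), ("dee", "eli")]
print(subgraph_density(edges, ["ana", "ben", "cy"]))         # a triangle: all 3 of 3 possible edges
print(subgraph_density(edges, ["ana", "ben", "cy", "dee"]))  # 4 of 6 possible edges
```

A value of 1 means the chosen users form a clique, and values close to 1 correspond to the near-cliques mentioned above.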
Various notions of graph density are used in determining the densest subgraph. One is edge density, which is measured by dividing the number of edges by the number of nodes. Another fundamental concept is the k-core, which identifies the subgraph with the largest minimum degree rather than the largest average degree. The k-core concept was introduced in 1970 by Lick and White and was later analyzed in many other papers (Faragó and Mojaveri 4). The k-core has been widely used in densest subgraph computation since it is easy to find algorithmically. Therefore, mathematically speaking, the densest subgraph would be the best means of analyzing data-to-data transmission on social media websites.
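Both notions can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not an implementation taken from the cited papers: it computes edge density as edges per node and finds a k-core by the standard peeling idea of repeatedly deleting nodes whose degree is below k. The names `edge_density`, `k_core`, and the toy graph are hypothetical.

```python
from collections import defaultdict


def edge_density(edges, nodes):
    """Number of edges inside `nodes` divided by the number of nodes."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    inside = {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in nodes and v in nodes
    }
    return len(inside) / len(nodes)


def k_core(edges, k):
    """Return the node set of the k-core: every remaining node has degree >= k."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    alive = set(adj)
    degree = {u: len(adj[u]) for u in adj}
    # Peel: repeatedly remove nodes whose current degree is below k.
    stack = [u for u in alive if degree[u] < k]
    while stack:
        u = stack.pop()
        if u not in alive:
            continue
        alive.remove(u)
        for v in adj[u]:
            if v in alive:
                degree[v] -= 1
                if degree[v] < k:
                    stack.append(v)
    return alive


# Hypothetical graph: a 4-clique {a, b, c, d} with a two-node tail d-e-f.
graph = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("d", "e"), ("e", "f")]
core = k_core(graph, 3)
print(sorted(core))
print(edge_density(graph, core))
```

In this toy graph the peeling step strips the sparse tail, and the remaining core has a higher edge density (6 edges over 4 nodes) than the full graph (8 edges over 6 nodes), which illustrates why the k-core is a convenient starting point when looking for dense regions.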
On the other hand, transmitting data from one user to another across social media websites is also essential. The development of mobile communications has allowed people to access SNS on their smartphones. Various challenges have marred the traditional mesh network, making it hard to transmit data from one user to another (Yang, Wu, and Luo 1). This has led to the emergence of the opportunistic network, which, unlike the traditional mesh network, does not require the network size and node locations to be set in advance. In addition, there is no need to set up a complete path between the source node and the target node. The main advantage of the opportunistic network is that nodes communicate whenever they enter each other's communication range, which facilitates a much faster exchange of data between users. Yang, Wu, and Luo (2) note that the opportunistic network thus helps eliminate problems of wireless networks, such as network delays and network splits, and makes network communication much less expensive.
Opportunistic networks are associated with remote-area network transmission, handheld device networking, in-vehicle networking, and wildlife tracking. With the arrival of the 5G network, tablets, Bluetooth-enabled computers, smartphones, and laptops have increased in number and are now widely distributed across large geographical areas. People can move from one place to another with their devices, which has led to the formation of social nodes. Therefore, unlike traditional signal transmission, which hampers data transmission across nodes by affecting data acceptance, the opportunistic network improves the broadcast characteristics within the interference range and hence eliminates node broadcast delay (Yang, Wu, and Luo 2). This leads to low latency and high data transmission rates across SNS.
More importantly, opportunistic networks across SNS support the store-carry-forward transmission strategy. In this strategy, the data is sent from the source node to the destination node even when no complete network path is available (Xiao and Wu 3); intermediate nodes store the data and carry it until they can forward it. However, for opportunistic networks to support efficient data transmission across SNS, it is vital to have an efficient routing algorithm in place. Routing algorithms for opportunistic networks have been widely studied, leading to the proposal of many algorithms. Amin Vahdat and David Becker proposed the epidemic routing algorithm, which uses the nodes that a message carrier meets to help transmit data. The epidemic routing algorithm has been cited as reducing data transmission delay and improving the average hop count and average delay time. Another algorithm, Spray and Wait, was proposed by Spyropoulos et al. (253) to overcome the shortcomings of the epidemic algorithm. It works in two phases: a spray phase and a wait phase. In the spray phase, the source node sprays L copies of the data to neighboring nodes, after which the wait phase starts. In the wait phase, each node holding a copy waits until it meets the destination node and then delivers the message to it directly. The core aim of the Spray and Wait algorithm is to support a much faster transfer rate across nodes. It is thus essential to select the best routing algorithm when setting up opportunistic networks to support faster data transmission rates across SNS.
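The two phases of Spray and Wait can be illustrated with a small simulation. This is a simplified sketch under several assumptions rather than the scheme exactly as specified by Spyropoulos et al.: contacts are supplied as a time-ordered list of node pairs, only the source sprays, and the source keeps one of the L copies for itself so that it hands out L - 1. The function name `spray_and_wait` and the sample contact trace are hypothetical.

```python
def spray_and_wait(contacts, source, destination, copies):
    """Simulate a source-spray variant of Spray and Wait.

    contacts: time-ordered list of (node_a, node_b) meetings.
    copies:   L, the total number of message copies allowed in the network.
    Returns (step, carriers): the contact index at which the destination
    receives the message (None if never), and the set of nodes holding a copy.
    """
    carriers = {source}
    remaining = copies - 1  # assumption: the source keeps one copy itself

    for step, (a, b) in enumerate(contacts):
        # Any carrier that meets the destination delivers directly.
        if (a in carriers and b == destination) or (b in carriers and a == destination):
            return step, carriers

        # Spray phase: the source hands one copy to each new node it meets.
        if remaining > 0 and source in (a, b):
            other = b if a == source else a
            if other not in carriers:
                carriers.add(other)
                remaining -= 1
        # Wait phase: relays hold their copy and do not forward to other relays.

    return None, carriers


# Hypothetical contact trace: s is the source, d the destination.
trace = [("s", "r1"), ("s", "r2"), ("s", "r3"), ("r2", "x"), ("r1", "d")]
print(spray_and_wait(trace, "s", "d", copies=3))
print(spray_and_wait(trace, "s", "d", copies=1))
```

With three copies the message reaches d through relay r1 even though s never meets d; with a single copy the scheme reduces to direct delivery and the message is not delivered in this trace. Epidemic routing would instead hand a copy to every node met, which is the flooding cost that the copy limit L is meant to bound.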
Data-to-data transmission across social network sites is essential since it supports the ongoing interaction between users. Information diffusion is the process by which data and information are transmitted from one user to another across these sites. Every social network site must ensure that the information diffusion process is fast to avoid disappointing users. Therefore, when studying data-to-data transmission across social network sites, one of the fastest ways to support the whole process is using densest subgraphs. The densest subgraph makes it easier for analysts to map out all the data-to-data transmissions between users and thus to ascertain the social position of every user. It also helps overcome the inefficiencies linked with the Louvain community detection algorithm. On the other hand, the transmission of data from one user to another is also critical. Traditional wireless technologies have been marred by high latency and low data transmission rates. The ever-increasing number of smartphones, tablets, and Bluetooth-enabled computers has distributed users across a larger geographical area. With the emergence of the 5G network, using an opportunistic network will support much faster data-to-data transmission. However, it is essential to ensure that the routing algorithm used supports fast data-to-data transmission in an opportunistic network.
Dasgupta, Subhasis, and Amarnath Gupta. “Discovering interesting subgraphs in social media networks.” 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). IEEE, 2020.
Epasto, Alessandro, Silvio Lattanzi, and Mauro Sozio. “Efficient densest subgraph computation in evolving graphs.” Proceedings of the 24th international conference on the world wide web. 2015.
Faloutsos, Christos, Kevin S. McCurley, and Andrew Tomkins. “Connection subgraphs in social networks.” SIAM International Conference on Data Mining, Workshop on Link Analysis, Counterterrorism and Security. Vol. 2. 2004.
Faragó, András, and Zohre R Mojaveri. “In search of the densest subgraph.” Algorithms 12.8 (2019): 157.
Spyropoulos, T., K. Psounis, and C. S. Raghavendra. “Spray and wait: an efficient routing scheme for intermittently connected mobile networks.” Proceedings of the 2005 ACM SIGCOMM workshop on Delay-tolerant networking. 2005. 252-259.
Tsourakakis, Charalampos E. “A novel approach to finding near-cliques: The triangle-densest subgraph problem.” arXiv preprint arXiv:1405.1477 (2014).
Vahdat, Amin, and David Becker. “Epidemic routing for partially connected ad hoc networks.” (2000): 2019.
Xiao, Yutong, and Jia Wu. “Data transmission and management based on node communication in opportunistic social networks.” Symmetry 12.8 (2020): 1-13.
Yang, Weiyu, Jia Wu, and Jingwen Luo. “Effective data transmission and control based on social communication in social opportunistic complex networks.” Complexity 2020 (2020).
Yazdi, Kasra Majbouri, Adel Majbouri Yazdi, Saeid Khodayi, Jingyu Hou, Wanlei Zhou, Saeed Saedy, and Mehrdad Rostami. “Prediction optimization of diffusion paths in social networks using integration of ant colony and densest subgraph algorithms.” Journal of High-Speed Networks 26.2 (2020): 141-153.