{"id":3903,"date":"2022-01-31T09:18:48","date_gmt":"2022-01-31T08:18:48","guid":{"rendered":"https:\/\/datascience.unifi.it\/?page_id=3903"},"modified":"2022-01-27T17:24:33","modified_gmt":"2022-01-27T16:24:33","slug":"interview-with-andrea-marino","status":"publish","type":"page","link":"https:\/\/datascience.unifi.it\/index.php\/what-data-scientists-really-do-we-ask-them-for-you\/interview-with-andrea-marino\/","title":{"rendered":"Interview with Andrea Marino"},"content":{"rendered":"<p><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">Andrea Marino, researcher at the Department of Statistics, Computer Science, Applications \u201cG. Parenti\u201d at University of Florence, was our guest to the D2 seminar presenting his study entitled: \u201cApproximating the Neighborhood Function of (Temporal) Graphs\u201d.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">When we talk about degrees of separation, some network can immediately came to our mind, social network like Facebook, based on \u201cfriendship\u201d relation, or the Internet Move Database, an online archive full of informations about films, tv series, cast, crew and so on, who began like a fan operated database, based on collaboration. These are some average distance graphs, largely investigated, but what happens when the number of nodes involved grows up to millions or billions? Computing this measure requires huge time and space costs since for each node must be computed the so-called neighborhood function. In the presented research the networks analysed were temporal ones, in which each edge has some sort of label specifying their occurring times. They can be thought like timetables of public transportation, where the notion of distance is the earliest arrival time between two nodes. The novelty of his study is that it introduces a probabilistic algorithm to approximate sizes and neighborhood functions of temporal graphs, to avoid the computation to go on endlessly.<\/span><\/span><\/span><\/p>\n<p><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><i>Ho<\/i><\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><i>w would you explain your results to someone who doesn\u2019t know <\/i><\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><i>what a temporal neighborhood function is<\/i><\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><i> but <\/i><\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><i>has certainly heard about degrees of separation as a social network user?<\/i><\/span><\/span><\/span><\/em><\/p>\n<p><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">Given two individuals Alice and Bob, the degrees of separations correspond to the number of intermediaries between Alice and Bob, in the sense that, if Alice is separated by Bob with h degrees of separations, it means that Alice has a friend who knows a friend, who knows a friend, &#8230;, who knows Bob, where the number of these intermediate individuals is h. But what if we have many individuals and we want to compute the average degree of separation among all the pairs of individuals, not just between Alice and Bob? This is something very expensive to compute.<\/span><\/span><\/span><\/p>\n<p><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><span style=\"color: #777777;\">In order to do this, we can model friendships using a graph of friendships between individuals and study the distance between individuals in this graph. In particular for each individual, say Alice, and for each integer h, we are able to approximate the number of individuals x such that Alice needs less or equal than h intermediates to reach x, i.e. the number of individuals which are at distance within h from Alice (this is called the neighborhood function). This can be done in a recursive way, where an approximation for distance h can be obtained using the approximation for the distance h-1. The approximation makes use of probabilistic counting, where sets of friends within distance h are concisely and approximately represented in order to fasten the execution of the algorithm.<\/span><\/span><\/span><\/p>\n<p><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><span style=\"color: #777777;\">We have shown that the algorithm can be extended in the case of temporal graphs, where edges can be traversed only at specific times and the only valid paths to be considered are the ones respecting these time constraints. In this sense the temporal neighborhood of a vertex is composed by the vertices that can be reached not in h steps but within time t.<\/span><\/span><\/span><\/p>\n<p><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">D<\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><i>uring your presentation you mentioned<\/i><\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"> that your research create<\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">s<\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"> a bridge between computer science and data science, <\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">c<\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">ould you explain <\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">what is the outlook of <\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><i>th<\/i><\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"><i>e<\/i><\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"> study <\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">in this sense<\/span><\/span><\/span><\/em><em><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">?<\/span><\/span><\/span><\/em><\/p>\n<p><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">This research puts together graph theory and probabilistic counting. Probabilistic counting is a technique which allows us to approximate sets in order to estimate the number of distinct elements in them and in order to estimate the size of their unions. Probabilistic counting makes use of special sampling strategies which are very useful for designing randomized algorithms. While statistics takes care of the quality guarantees of these samples, i.e. unbiasedness and concentration, computer science takes care of the efficiency of computing samples, in this case focusing on the invariants to be maintained during the computation in order to maintain the guarantees. <\/span><\/span><\/span><\/p>\n<p><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">Andrea Marino is also a member of our center. More information about his research can be found on his personal <\/span><\/span><\/span><a href=\"http:\/\/www.andreamarino.it\/\"><span style=\"color: #dd1313;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">page<\/span><\/span><\/span><\/a><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">You can access the recording of this <\/span><\/span><\/span><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">as well as<\/span><\/span><\/span> <span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">to<\/span><\/span><\/span><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\"> other seminars at this <\/span><\/span><\/span><a href=\"https:\/\/us02web.zoom.us\/rec\/share\/D4dMlgndH2uIocZMuulCwACHEDlRgYYvdyNzKE1hmd8va-BuBeSZlKS1Rjt6b_yD.eMkS3ffDe4HW4sTC\"><span style=\"color: #dd1313;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">link<\/span><\/span><\/span><\/a><span style=\"color: #777777;\"><span style=\"font-family: Open Sans, Helvetica, Arial, sans-serif;\"><span style=\"font-size: small;\">. (Registration needed)<\/span><\/span><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Andrea Marino, researcher at the Department of Statistics, Computer Science, Applications \u201cG. Parenti\u201d at University of Florence, was our guest to the D2 seminar presenting his study entitled: \u201cApproximating the &#8230;<\/p>\n","protected":false},"author":12,"featured_media":3815,"parent":3762,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_monsterinsights_skip_tracking":false,"footnotes":""},"class_list":["post-3903","page","type-page","status-publish","has-post-thumbnail","hentry"],"_links":{"self":[{"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/pages\/3903","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/comments?post=3903"}],"version-history":[{"count":2,"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/pages\/3903\/revisions"}],"predecessor-version":[{"id":3906,"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/pages\/3903\/revisions\/3906"}],"up":[{"embeddable":true,"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/pages\/3762"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/media\/3815"}],"wp:attachment":[{"href":"https:\/\/datascience.unifi.it\/index.php\/wp-json\/wp\/v2\/media?parent=3903"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}