ISRF: interest semantic reasoning based fog firewall for information-centric Internet of Vehicles

Internet of Vehicles (IoV) has attracted much attention as an important component of intelligent transportation. Meanwhile, the information-centric network, a promising next-generation network with content-oriented attributes, has been introduced into IoV to provide more flexible services. In information-centric IoV, the content big data in the network layer carries abundant semantic features. Because very few works on information-centric IoV consider semantic models, current security approaches cannot efficiently defend against novel threats that use penetrated malicious content and semantic obfuscation, especially at the network edge. An efficient firewall is therefore essential for information-centric IoV. However, existing firewalls based on traditional computing resources cannot detect content threats hidden in semantics at the edge of networks. In this study, the authors propose an interest semantic reasoning based fog firewall for information-centric IoV. Firstly, a fog firewall is proposed to construct edge isolation for information-centric IoV. Moreover, interest semantic reasoning with context weight and a knowledge graph is proposed to mine implicit relations and illegal knowledge in names and contents. Based on the knowledge constructed from reasoning, the firewall aims to perceive threats from interest packets in information-centric IoV. Simulations verify the feasibility and efficiency of the proposed scheme.


Introduction
In the new era of the Internet of Things (IoT), researchers are working to evolve connected vehicle networks into the Internet of Vehicles (IoV). Vehicles are connected to the network to communicate with other vehicles, roadside units (RSUs), infrastructure and cloud centres [1]. IoV aims to provide timely and reliable communication of hazard/warning and congestion information among vehicles to improve safety and the driving experience [2]. In addition, vehicles are expected to act as personal computers that provide entertainment and business functions. Therefore, plenty of IoV services have appeared. For example, traffic condition broadcasting services help avoid traffic jams, and danger prediction and alert services provide a safer driving experience. All these services aim to provide extensive information locally. Moreover, some conventional Internet applications, such as social services and multimedia applications, also occupy a certain proportion of IoV services to provide a better travelling experience. Numerous vehicles are equipped with on-board computers to provide entertainment and business services to passengers. What these services have in common is that they produce big data streams and tend to be shareable. Thus, service-oriented information processing and transmission dominates the IoV environment [3].
However, the exponentially growing IoV traffic is shifting demand from IP-based communications to content-centric communications such as multimedia streaming and information sharing [2]. On the one hand, IP address-oriented connections adapt poorly to the high mobility, localised communications and extremely dynamic environments of IoV. A TCP/IP connection has to be maintained even when high mobility disrupts it. On the other hand, the rapidly changing network topology in a mobile environment tends to result in inefficient routing. Therefore, the information-centric network (ICN) is regarded as an alternative to IP networks in IoV. ICN is a promising paradigm for next-generation networks [4]. It focuses on providing information-oriented network protocols, including a content-centric subscribe mechanism and semantic-dominated naming, routing and caching strategies. Consequently, the information-centric IoV structure exhibits abundant semantic features and relies on content subscription, distribution and caching.
The reasons why ICN is introduced into IoV are as follows. First, unlike the IP end-to-end network architecture, ICN supports data transmission even in unreliable and extreme environments, such as the frequently changing connections of mobile and ubiquitous computation [5]. Second, in most cases, users in IoV do not care about the sources of information or services, but about the content itself. In addition, some popular data tend to be requested frequently under certain conditions or in certain locations; in this case, the in-network cache can provide low-latency services to terminal users. Therefore, many existing works have studied information-centric IoV. The scene diagram of information-centric IoV is shown in Fig. 1.
Featuring content-driven subscription, information-centric IoV suffers from novel penetration threats that are hidden in content. Packet names are generated from content attributes and use human-readable language, which creates the potential for semantic vagueness and ambiguity. Firstly, the names of interest packets may carry latent semantic features and relevance to illegal content. Moreover, attackers tend to send implicitly forbidden content that is obfuscated and correlated with requests in some semantic dimensions, which can defeat detection based on the inherent firewall blacklists. Therefore, the content-oriented subscribe mechanism opens the door to content obfuscation and spoofing attacks [6]. It is important to perceive content threats and to identify the various illegal features in content. Defending against penetrated content threats requires semantic reasoning to identify the inherent and potential illegal features of content and to mine knowledge relations between traffic. Therefore, semantic-based security knowledge policies are required on top of the traditional defence. However, existing works on information-centric IoV security address attacks on naming, routing and caching separately [7]. They neglect content semantic analysis, which could mine potential threats and construct knowledge relations from the nature of different content. Moreover, lacking content awareness and semantic analysis, existing works can neither perceive potential threats in packet content nor customise the defence policy for the content of each distinct packet.
Moreover, conventional TCP/IP firewalls are incompatible with information-centric IoV protocols and data structures. Firstly, the content-oriented packet formats and subscription schemes of information-centric IoV limit the migration of TCP/IP firewall functions. Secondly, existing firewalls depend on the traditional computation paradigm and fail to conduct content semantic analysis and content threat awareness at the edge of networks. Thirdly, the blacklists and blocking policies of existing firewalls are created uniformly in the cloud centre. All the firewalls apply unified policies without any analysis tools and neglect the heterogeneity of different network environments, which causes numerous policy errors. Last but not least, current firewall databases of threat information are deployed in the cloud centre, which makes it inconvenient to upload and share the various kinds of novel semantic threats in information-centric IoV dynamically. Thus, existing firewalls cannot construct an isolation defence system for information-centric IoV.
To address the aforementioned challenges, we propose a fog-enabled firewall for information-centric IoV to provide isolation defence as well as content-threat awareness with interest semantic reasoning and policy customisation. To perform edge isolation and semantic content awareness for information-centric IoV, fog computing, which transfers intelligence and data from the cloud centre to the edge network [8], is adopted to offer computation and edge distribution in the proposed firewall system. We employ fog computing to support context and content awareness of information-centric IoV packets and to conduct the semantic analysis and customised configuration of information-centric IoV firewalls. Additionally, a semantic reasoning approach based on a knowledge graph is designed. It aims to construct security knowledge and mine potential relations between the requested content and the blacklists of illegal information. Besides, to defend against latent and continuous attacks, the proposed firewall model collects context traffic correlated with pending packets, which emphasises relevant semantic dimensions to guide the reasoning direction. Semantic reasoning analyses the content attributes through interest names and thus predicts the potential threats of pending packets and their response data. Customised policies for each packet are generated as a result of the semantic analysis. Moreover, customised configuration allows users to specify filtering policies that exclude unexpected responses beyond the blacklists. Our contributions are as follows:
• We propose a fog-based firewall system for information-centric IoV, which utilises the content awareness and edge-distributed computation of fog to construct information-centric IoV edge isolation, and thus to prevent threatening traffic and requests from entering information-centric IoV.
• A semantic reasoning scheme based on a knowledge graph is proposed to mine potential threats in packets. Firstly, a context selection scheme based on semantics is proposed to guide the reasoning direction of interest packets. Moreover, weighted semantic reasoning is designed to reason the threatening relations and knowledge associated with interest packets and thus to supplement the blacklists with potential threats in both interest and data packets. Based on the knowledge constructed from reasoning, the proposed scheme perceives penetrated and obfuscating content threats and configures customised defence policies for distinct interests.
• The proposed information-centric IoV firewall system can defend against both content threats and network layer threats in information-centric IoV. The considered threats cover network layer attacks (anomaly requests, packet tampering and falsifying) as well as content and cache attacks (information leakage, illegal data and obfuscating data).
The remainder of this paper is organised as follows. Section 2 describes the research related to this paper. The design principles of the content threat-aware fog-firewall system are introduced in Section 3. In Section 4, the firewall architectures and workflow are designed. Section 5 presents the context-based interest semantic reasoning for threat-aware policy customisation. Simulations and preliminary results are provided in Section 6. Finally, we draw our conclusions in Section 7.

Related work
In recent years, frameworks based on information-centric IoV have been proposed to combine ICN and IoV. Researchers have applied ICN to IoV to overcome the above disadvantages of IP-based end-to-end communication. Moreover, ICN is compatible with the fact that IoV network services are typically dominated by content distribution.
In [2], Li et al. proposed a novel crowdsourced VCCN framework to provide secure and efficient information distribution for IoV. It allows vehicles to crowdsource their cache resources and radio links and then to distribute content collaboratively. Besides, the authors in [9] designed a one-hop large-size content distribution mechanism for IoV, which applies the emerging NDN to V2V communications. It enables services including nearby requests, intelligent responses, multi-source supplies and breakpoint resume. In [10], a hierarchical location-based content naming mechanism was proposed for IoV. The authors also designed a distributed mobility management method that takes the directions of movement in the content names into consideration. However, the aggregation of the proposed location-based mechanism degrades owing to the mobility of vehicles.
Recently, some researchers have focused on the security and communication of IoT [11][12][13], including IoV [14], Smart Grid [15][16][17][18] and SDN [19]. However, studies on information-centric IoV firewalls are still in their infancy. In [20], a study of CCN security requirements is introduced and the first CCN firewall with syntax is designed. However, filtering is conducted on isolated packets without consideration of context, which leaves the firewall vulnerable to latent and continuous attacks. Moreover, the proposed semantic analysis focuses on name syntax while neglecting the relations between packets and the blacklist. In [21], a distributed context-aware firewall takes account of the specialised knowledge of protected services to decrease processing delay. However, it cannot be applied in information-centric IoV scenarios because of the distinct network protocols and architecture. In [22], a simple filter technique based on web URL statistics for firewalls is proposed to detect anomalous information-centric IoV names. The authors in [23] present a novel system for detecting and avoiding poisoned content, which leverages the verification work that users must do anyway. The above studies propose efficient protection policies for information-centric IoV, while little attention is paid to constructing a firewall system for it. Owing to its information awareness [24], ICN is expected to be widely used in 5G, Smart Grid and Vehicle-to-Grid [25][26][27]. An efficient access control framework for ICN is proposed in [28], which allows legitimate users to access and utilise the cached content directly without verification/authentication. The work in [29] designs a secure content distribution architecture for CCN that is based on proxy re-encryption. Misra et al. create a secure content delivery framework especially applicable to mobile devices in [30]. Zhang et al. [31] proposed a cooperative random interest propagation mechanism to prevent content-client linkability. Existing studies provide good support for resolving various information-centric IoV content threats in the network layer. However, information-centric IoV content threats that penetrate through semantics and ambiguity are neglected.
Fog computing has been widely applied in next-generation networks [32][33][34][35]. Studies involve optimising the computation resources of the application layer [36] and making use of content awareness to reassign services [37]. Considering its advantages of content awareness and edge distribution, fog computing can greatly benefit information-centric IoV firewalls in threat-aware filtering and semantic reasoning to construct edge defence isolation.
To sum up, much more attention needs to be paid to information-centric IoV firewalls with fog computing at the edge of networks. Interest semantic analysis to mine implicit threats in content is of vital importance against content obfuscation and penetration. Customised policy configuration with user and threat awareness should be considered for distinct traffic.

Design principles of content threat-aware fog-firewall system
This section gives an overview of the proposed filtering policies and the information-centric IoV threats they prevent. It also describes the two types of network boundary at which the proposed fog-firewalls are placed.

Defence principles and categories of threat defence policies
The threats we consider for information-centric IoV security isolation are divided into two aspects: content threats and network layer threats. Content threats refer to information leakage, illegal or unexpected content transmission, and content obfuscation. Content threats also cover cache attacks, in which attackers fill the cache nodes with invalid or unexpected content. For the network layer, threats include anomaly requests, spoofing, in which attackers monitor, block, tamper with and falsify packets [6,7], and illegal participants. Table 1 shows the considered security threats and our proposed filtering detection objects. The proposed information-centric IoV firewall system aims to provide a security isolation system that defends against both content and network layer threats. The filtering principles are designed as follows.
Data integrity checking: Detect whether data has been tampered with or whether blacklisted producers send responses, either of which would cause illegal or uncorrelated content to be sent out under the requested names.
Content filtering: On the one hand, by analysing interest names, anomaly requests for illegal, malicious or invalid data are prevented from entering information-centric IoV. For data packets, illegal and unexpected content is blocked with the blacklists on the boundaries, which prevents invalid data from being cached. On the other hand, security content isolation between different networks is considered, which covers unauthorised access and information leakage.
Customised filtering policy: Besides the inherent blacklist, semantic reasoning on information-centric IoV interests is conducted to mine potential threatening knowledge and then to supplement the blacklists for customised policy configuration. Moreover, the components of information-centric IoV packets, including name components, selectors, ForwardingHint, producer signatures and content [5], are extracted as filtering policies to exclude unexpected sources and content of response packets. Thus the proposed firewalls allow consumers to configure filter policies dynamically.

Content threat awareness with interest semantic reasoning
Given the inherent ambiguity of human-readable language, content obfuscation and penetration are novel threats to information-centric IoV. Malicious nodes send implicitly forbidden content as data packets that are obfuscated and correlated with the requested names in some semantic dimensions, which can defeat recognition based on the inherent firewall blacklists. As for interest packets, the names of the requested content may have potentially threatening relations with entities in the blacklists even though those entities do not appear explicitly.
Therefore, we pay special attention to content obfuscation attacks because of their unique need for semantic analysis. For content threats with semantic ambiguity and penetrated illegality, interest semantic reasoning is designed with a knowledge graph to mine potential threatening knowledge through the names of interest packets. The threats to be mined include relations between entities in the requested content attributes and entities in the blacklists, and threatening entities that compose the potential relations between packets and blacklists. Based on the semantic reasoning of each information-centric IoV interest packet, we detect the implicit illegal features of the interest and then predict the obfuscating threatening responses to be excluded. Furthermore, for latent and continuous attacks, context communication related to the pending packets in content, names or producers is taken into consideration, which guides the reasoning direction to focus on the semantic dimensions of the requests. The above-mentioned schemes are introduced in Section 5.

Edge defence isolation of proposed fog-firewall system
As shown in Fig. 2, IoV nodes are deployed as terminals at the edge and bottom layer of the whole network, including vehicles, the various sensors on vehicles, passengers and other infrastructure. They serve as the data creators. Terminals do not need to be equipped with firewalls because many of them cannot bear the computation and storage required for a firewall service. Besides, deploying firewalls in the upper fog nodes helps the firewall system conduct a more comprehensive analysis of traffic and perceive threats across the covered local networks rather than at a single node.
Fog computing nodes are deployed at the middle layer of the information-centric IoV. They act as a bridge and also as a security isolation system. In this paper, two kinds of edge defence isolation are introduced according to the different attacks on vehicles and the Intranet. On the one hand, terminal isolation is constructed between information-centric IoV terminals and networks. Taking advantage of conventional terminal firewalls, it can isolate illegal communication between terminals and information-centric IoV. Terminal isolation is deployed logically between terminals and the next hop of information-centric IoV nodes to prevent malicious data from entering information-centric IoV. As for the physical deployment, it is placed on fog nodes that achieve seamless geographical coverage of terminals. Each terminal is in the jurisdiction of one corresponding fog-firewall. Moreover, because the computation is carried out on fog nodes, the burden on terminals of configuring firewalls is sharply reduced.
On the other hand, network isolation is established to prevent malicious data from spreading between different information-centric IoV networks. Traditional network boundaries that connect networks of different security levels are considered here to isolate communication between the Intranet and the Extranet. Thus, information-centric IoV network isolation is deployed on fog nodes beside gateways. A physical gateway routes both inbound and outbound packets to the network firewall after processing.
ICN nodes are the components of the top layer. They act as the core network and provide addressing, routing and caching functions. Fog nodes are responsible for building the isolation system between the other two layers.
We describe the attack and defence scenarios in Fig. 2 as well. The scenarios are classified according to the attacking sources. For malicious requests, we consider both anomaly requests from edge terminals and unauthorised requests between LANs. The former can be detected by signatures and packet names in terminal firewalls, so they cannot access the network. This is because the signature and name components include the identity and content abstract information, which cover the threat information. The latter refers to unpermitted access to external LANs, which cannot be detected by terminal firewalls. However, firewalls on the gateways can detect and block it according to packet names. Therefore, unauthorised access can be stopped outside the network. On the other hand, for malicious responses, we give the defence scenarios for the threats considered in Section 3.1. Data attacks including cache, obfuscation, falsifying and leakage attacks can be blocked by parsing the signature, name and data components of packets to check the security of the data. Moreover, illegal data producers can be detected by the signature.

Proposed content threat-aware fog-firewall system with customised policies
The proposed content threat-aware fog-firewall system for information-centric IoV consists of terminal isolation fog-firewalls (TI2Fs) and network isolation fog-firewalls (NI2Fs). TI2Fs block illegal requests as well as illegal data from terminals entering information-centric IoV. NI2Fs detect illegal access from the Extranet and prevent Intranet data leakage. In this section, we design the proposed TI2F and NI2F as well as the workflow of the firewall defence operations.

Content threat-aware TI2Fs
The TI2F is designed based on the hierarchical fog model proposed in [34], as shown in Fig. 3. Packets sent from terminals are first routed to the monitoring layer of a TI2F and then passed to the preprocessing layer, which parses them to extract their components. Interest packets are then analysed in the policy layer to generate customised filter policies for both the interest itself and its response. The filter layer performs blocking with the inherent and customised filter policies. The security layer attaches the customised policies to the interest, so that the policies can be reused when packets are cached in other nodes. These policies are retrieved by fog computing firewalls in the monitoring layer. The routing layer then forwards the new packet to the next hop.
The monitoring layer retrieves and records all the communication behaviours of each information-centric IoV node. The monitored objects include terminals, network topology, activities and resources. Therefore, TI2Fs have access to all the requests, responses and device attributes of each information-centric IoV node, which helps them perceive user behaviour and context traffic. Data obtained in the monitoring layer is stored in two databases. Device attributes and location information are placed in the network database. The communication activities are recorded in the log database.
The preprocessing layer parses packets and extracts the components of each packet. The components extracted from interest packets include packet names, selectors and ForwardingHint [37]. The selectors specify publisher keys and the names to be excluded from responses. ForwardingHint represents an authorisation list of forwarding objects. For data packets, the name, signature, signed information and content components are extracted into the database.
The policy layer configures customised filtering policies for distinct packets dynamically. It consists of three modules: correlated context analysis, semantic reasoning and user-configured policy extraction. For interest packets, the correlated context is selected and the relevance weight matrix is calculated to guide the reasoning direction. Then the policy layer conducts semantic reasoning between request names and blacklists. The reasoning thus constructs the security knowledge policy and finds related new entities to expand the blacklist. Moreover, user-configured filtering policies are generated by combining the extracted packet components. Selector components are used to restrict producer identities and to exclude packet names, while ForwardingHint is used to control packet transmission. All the policies for one specific interest packet are called interest-specified policies (ISPs).
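How the extracted components of one interest might be combined into an ISP can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class `InterestSpecifiedPolicy`, the function `build_isp`, the dictionary keys and the example names are all hypothetical, and the semantic reasoning step is represented simply by a set of entities supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterestSpecifiedPolicy:
    """Hypothetical container for the policies attached to one interest."""
    interest_name: str
    allowed_publisher_keys: frozenset = frozenset()  # from selectors
    excluded_names: frozenset = frozenset()          # from selectors
    forwarding_hint: tuple = ()                      # authorised forwarders
    reasoned_entities: frozenset = frozenset()       # blacklist supplement


def build_isp(name, selectors, forwarding_hint, reasoned_entities):
    """Combine extracted interest components into one ISP."""
    return InterestSpecifiedPolicy(
        interest_name=name,
        allowed_publisher_keys=frozenset(selectors.get("publisher_keys", ())),
        excluded_names=frozenset(selectors.get("exclude", ())),
        forwarding_hint=tuple(forwarding_hint),
        reasoned_entities=frozenset(reasoned_entities),
    )


# Illustrative values only.
isp = build_isp(
    name="/traffic/road42/congestion",
    selectors={"publisher_keys": ["rsu-key-1"],
               "exclude": ["/traffic/road42/ads"]},
    forwarding_hint=["/fog/node-7"],
    reasoned_entities={"ads", "tracker"},
)
```

Making the record immutable is a design choice of this sketch: an ISP that is tagged onto an interest and reused by other nodes should not change after it has been generated.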
In the filtering layer, the integrity and validity of interest packets are checked to prevent requests and ISPs from being tampered with by malicious nodes. Besides, content matching based on the blacklist is conducted to prevent anomaly requests. For received data packets, the integrity and validity check, user-configured policy matching and the content filter are applied in that order. The content filters for interest and data packets differ in the objects they process. Only the content names in interests are filtered for malicious request detection, whereas the whole content of data packets is filtered to prevent illegal, unexpected, obfuscated and falsified content attacks.
The security layer consists of two modules: a tagging module and an encryption module. The tagging module adds a new ISP tag from the policy layer to interest packets and expands the blacklist with the threatening entities obtained from reasoning. Thus, other information-centric IoV nodes can acquire ISPs when caching the interest. Related TI2Fs can obtain ISPs in the monitoring layer as well and place them in the blacklist database. The encryption module encrypts the tagged interest. The routing layer then forwards new packets to the next hop into information-centric IoV.
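The ordered checks for a received data packet can be sketched as below. The sketch rests on several assumptions that are not in the source: packets and ISPs are plain dictionaries with hypothetical keys, `verify_signature` is a caller-supplied stand-in rather than a real cryptographic API, and content filtering is reduced to a case-insensitive substring match against the blacklist plus the entities added by reasoning.

```python
def filter_data_packet(packet, isp, blacklist, verify_signature):
    """Return (allowed, reason) after three checks applied in order."""
    # 1. Integrity and validity check.
    if not verify_signature(packet):
        return False, "integrity"
    # 2. User-configured policy matching from the stored ISP.
    allowed_keys = isp.get("publisher_keys")
    if allowed_keys and packet["publisher_key"] not in allowed_keys:
        return False, "publisher"
    if packet["name"] in isp.get("excluded_names", ()):
        return False, "excluded-name"
    # 3. Content filter over the whole payload (not only the name).
    terms = set(blacklist) | set(isp.get("reasoned_entities", ()))
    content = packet["content"].lower()
    for term in sorted(terms):
        if term.lower() in content:
            return False, "content:" + term
    return True, "ok"


def demo_verify(packet):
    """Stand-in verifier: trusts a boolean flag set by the test data."""
    return packet.get("signature_ok", False)


demo_isp = {"publisher_keys": {"k1"}, "excluded_names": {"/a/ads"},
            "reasoned_entities": {"tracker"}}
clean = {"name": "/a/b", "publisher_key": "k1",
         "content": "road is clear", "signature_ok": True}
obfuscated = dict(clean, content="map tile with embedded Tracker payload")
result_clean = filter_data_packet(clean, demo_isp, {"malware"}, demo_verify)
result_bad = filter_data_packet(obfuscated, demo_isp, {"malware"}, demo_verify)
```

Running the cheap integrity and policy checks first means the full-content scan, which is the most expensive step, only runs on packets that have already passed them.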

Content threat-aware NI2Fs
An NI2F is designed to prevent data leakage from a protected Intranet and unauthorised access from the Extranet, as shown in Fig. 4. Unlike the TI2F, the network firewall has no security layer. Interest packets routed to the network boundary have already been parsed in their TI2Fs to generate ISPs.
The monitoring layer acquires packets from both the Intranet and the Extranet. They are parsed in the same way as in terminal firewalls. The policy layer contains the communication record analysis and semantic reasoning modules. When an interest packet is received, the related communication content is calculated and fetched from the packet database. Then semantic reasoning over the blacklist, interest names and previous communication records is conducted to find potential illegal entities. The filtering layer blocks Extranet requests beyond the access authority. Besides, data packets sent to the Extranet are checked against blacklists for information leakage. The routing layer then forwards packets into information-centric IoV.
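The two boundary checks of the NI2F filtering layer can be sketched as a single decision function. This is a simplified illustration: the packet dictionaries, the `direction` labels, the prefix-based model of access authority and the substring-based leakage check are assumptions made for the example rather than details given in the source.

```python
def ni2f_decide(packet, direction, allowed_prefixes, leak_blacklist):
    """Decide whether an NI2F forwards a packet across the boundary."""
    if packet["type"] == "interest" and direction == "inbound":
        # Extranet requests must stay within the granted access authority.
        return any(packet["name"].startswith(p) for p in allowed_prefixes)
    if packet["type"] == "data" and direction == "outbound":
        # Data leaving the Intranet is checked against the blacklist.
        content = packet.get("content", "").lower()
        return not any(term.lower() in content for term in leak_blacklist)
    # Other combinations are outside this sketch and pass through.
    return True


prefixes = ["/public/"]
secrets = ["fleet-schedule"]
ok_request = {"type": "interest", "name": "/public/traffic/road42"}
bad_request = {"type": "interest", "name": "/internal/fleet/plan"}
leak = {"type": "data", "name": "/public/traffic/road42",
        "content": "Fleet-Schedule for Monday"}
```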

Defence workflow of fog-firewall system
The workflow of our proposed firewall system is shown in Fig. 5. For interest packets, requests are first acquired by TI2Fs before entering information-centric IoV. TI2Fs generate ISPs for the pending requests and the responded data. Anomaly requests are blocked here and legal requests are tagged with ISPs. On the forwarding path, relay nodes that record the interests in their pending interest table (PIT) are perceived by their TI2Fs. Consequently, TI2Fs acquire ISPs from the relay nodes, whereas the interest packets themselves need not be acquired or parsed by firewalls again, which saves transmission and processing delay.
When a producer sends a data packet as the response, the data is first acquired by the TI2F in whose jurisdiction the producer lies. With the previously stored ISPs, the TI2F can filter the data packet and decide whether to block it or allow it to enter information-centric IoV. As with interest packets, no further TI2Fs inspect the data packet on the forwarding path, because data in the information-centric IoV has already been inspected on entry.
In the case of internetwork communication, when an interest packet is routed to another private network, the NI2F performs access authority control to exclude illegal access. For data packets, the Intranet physical gateway sends them to the NI2F, which inspects them for information leakage. When the data is routed to the requester's private network, access authority control is conducted again to prevent malicious nodes from sending illegal or unexpected content.

Interest semantic reasoning for threat-aware policy customisation of proposed firewalls
We first design a context selection algorithm: the communication context correlated with the pending packet is collected according to content similarity, and the relevance weights are calculated. Two knowledge graph reasoning algorithms are then modified and applied to reason the potential relations and entities between requests and entities in blacklists.

Threat-aware direction guidance with context
The packet database in a TI2F, where all the communication packets are stored, can hold $N$ records for each covered terminal; $N$ is adjusted according to the computing capability of each fog node. The database adopts the first-in-first-out principle. In the preprocessing layer, all the packets in the database are parsed into key entities based on the name, selector and ForwardingHint components as well as the data content. The Jaccard similarity coefficient is introduced to measure the similarity between the context communication content and the pending packet. Here we assume there are $M$ parsed entities. The packet entity matrix $S^{E}_{M \times N} = [s^{e}_{ij}]$ is established to describe the relation between the entities and packets, where

$$s^{e}_{ij} = \begin{cases} 1, & \text{if entity } e_i \text{ appears in packet } p_j \\ 0, & \text{otherwise} \end{cases}$$

For the pending packet $p_1$, the Jaccard similarity coefficient of $p_1$ and context packet $p_j$ is calculated as $J_{1j}$, where $j$ varies from 2 to $N$ and $S^{E}_{\cdot j}$ represents the $j$th column of $S^{E}_{M \times N}$.
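The first-in-first-out database, the packet entity matrix and the column-wise similarity can be sketched as follows. The sketch uses the standard definition of the Jaccard coefficient, the size of the intersection divided by the size of the union of the two entity sets. The helper names, the example entities and the choice of treating the most recent record as the pending packet $p_1$ are assumptions made for illustration.

```python
from collections import deque


def entity_matrix(packets, entities):
    """Build S (M x N): S[i][j] = 1 if entity i appears in packet j, else 0."""
    return [[1 if e in p else 0 for p in packets] for e in entities]


def jaccard_columns(S, a, b):
    """Jaccard coefficient between columns a and b of a 0/1 matrix."""
    inter = sum(1 for row in S if row[a] and row[b])
    union = sum(1 for row in S if row[a] or row[b])
    return inter / union if union else 0.0


N = 4  # records kept per terminal; illustrative value
db = deque(maxlen=N)  # first-in-first-out: the oldest record is dropped
for pkt in [{"x"}, {"traffic", "road42"}, {"traffic", "jam"},
            {"music", "stream"}, {"road42", "jam", "traffic"}]:
    db.append(pkt)

# Assumption: the most recent record is the pending packet p_1 (column 0).
packets = list(reversed(db))
entities = sorted(set().union(*packets))
S = entity_matrix(packets, entities)
J = [jaccard_columns(S, 0, j) for j in range(1, len(packets))]
```

Here `J` holds the similarity of the pending packet to each of the remaining records, which is the input to the context selection step.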
Then the relevance value of $p_1$ and $p_j$ is normalised to obtain the relevance weight $r^{p}_{1j}$. We select the top $K$ packets with the highest relevance weights as the correlated context. The relevance matrix $R_{1 \times K} = [r^{p}_{1j}]$ is established with the relevance weights $r^{p}_{1j}$ of the selected packets in descending order. The entities of the selected packets form another entity matrix. The value of $K$ varies with the average degree of context relevance and the computing capability of the fog node. A higher overall relevance of the context to the pending packet leads to more correlated packets and their key entities being selected for reasoning, and a stronger fog node can deal with more correlated packets as well. All the key entities of the selected context form a new collection $E_c$. Each selected entity to be reasoned with the blacklists is assigned a relevance weight $r^{c}_{i}$. In summary, we select the top $K$ correlated packets and the relevance weights $r^{c}_{i}$ of their key entities, which are related to the request content.
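The selection of the top $K$ context packets can be sketched as follows. Two points are assumptions of this sketch rather than statements from the source: relevance is normalised by dividing each similarity by their sum, and packets with zero relevance are never selected. The function name `select_context` and the example data are hypothetical.

```python
def select_context(similarities, packets, k):
    """Pick the top-k context packets and collect their key entities.

    similarities[j] is the similarity between the pending packet and
    packets[j]; normalising by the sum is an assumption of this sketch.
    Returns (relevance, context_entities) where relevance is a list of
    (packet_index, weight) pairs in descending order of weight.
    """
    total = sum(similarities)
    if total == 0:
        return [], set()
    weights = [s / total for s in similarities]
    ranked = sorted(range(len(packets)), key=lambda j: weights[j],
                    reverse=True)
    top = [j for j in ranked[:k] if weights[j] > 0]
    relevance = [(j, weights[j]) for j in top]
    context_entities = set().union(*(packets[j] for j in top))
    return relevance, context_entities


sims = [0.0, 2 / 3, 1 / 3]
ctx_packets = [{"music"}, {"traffic", "jam"}, {"road42"}]
relevance, E_c = select_context(sims, ctx_packets, k=2)
```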

Interest semantic reasoning for security knowledge policy
Semantic reasoning includes two aspects: reasoning the potential threatening relations between the request and entities in blacklists, and mining the sensitive entities that can compose a multiple-step relation between requests and blacklists.
To mine possible threatening relations between requests and blacklists, we introduce and modify the TransA model, which uses metric learning to learn the relations between entities in a knowledge graph [38]. TransA proposes a loss metric with a score function (6) to estimate the relations of entity pairs, and a weight matrix to weight specific feature dimensions during training, where $e_h$, $r$, $e_t$ refer to the head entity, relation and tail entity in the knowledge graph. We modify the weight matrix $W_r$ with the semantics of the correlated context in (7). $\Delta$ is the dataset comprising the existing triples $(e_h, r, e_t)$ in the knowledge base (KB), and $\Delta'$ denotes the negative samples constructed from false triples. When the selected training triple contains an entity in the context collection $W_c$, the relevance value is applied in $W_r$ to reduce the score (6), where

$$A = \begin{cases} e^{-r^{c}_{i}}, & \text{if } e_h^{(\prime)} \text{ or } e_t^{(\prime)} \in W_c \\ 1, & \text{otherwise} \end{cases} \qquad (8)$$

We take KB triples as learning samples to optimise the parameters in the loss metric. As the loss function value decreases during learning, the entity vectors and relation matrices better reflect the semantic information of entities and relations. As a result, relations whose semantics are similar to the context and the requests are learnt.
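To make the effect of the context factor concrete, the following Python sketch evaluates a TransA-style score with and without it. The quadratic form $|e_h + r - e_t|^{\top} W_r\, |e_h + r - e_t|$ is the score used by TransA; treating A as a scalar multiplier on $W_r$, taking the larger weight when both entities are in the context, and the margin-based hinge loss are assumptions made for illustration. All vectors, weights and entity names are hypothetical.

```python
import math


def context_factor(head, tail, context_weight):
    """A = exp(-r_c) if the head or tail is a context entity, otherwise 1.

    Assumption: if both entities are in the context, the larger weight is used.
    """
    hits = [context_weight[e] for e in (head, tail) if e in context_weight]
    return math.exp(-max(hits)) if hits else 1.0


def transa_score(h_vec, r_vec, t_vec, w_r, a=1.0):
    """Score |h + r - t|^T (a * W_r) |h + r - t| for one triple."""
    d = [abs(h + r - t) for h, r, t in zip(h_vec, r_vec, t_vec)]
    n = len(d)
    return a * sum(d[i] * w_r[i][j] * d[j] for i in range(n) for j in range(n))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss that pushes true triples to score lower than false ones."""
    return max(0.0, pos_score + margin - neg_score)


w_r = [[1.0, 0.0], [0.0, 2.0]]
h, r, t = [0.5, 1.0], [0.5, -0.5], [0.0, 0.0]

plain = transa_score(h, r, t, w_r)
a = context_factor("requested_name", "blacklisted_name", {"requested_name": 0.6})
weighted = transa_score(h, r, t, w_r, a)
```

Because A is at most 1, a triple that touches a context entity always scores no higher than the same triple without context, so training favours relations that agree with the observed context.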
We modify the PTransE model to find the potential sensitive entities that compose a relation between requests and blacklists; these entities are likely to be the obfuscating objects in the responded data packets. PTransE takes multiple-step relation paths into consideration for representation learning [39], and the score function of multiple steps in the loss metric is defined in (9). We train the model with the same datasets as TransA to optimise the loss metric, thus obtaining multi-step relation paths. To guide the relation path direction, we modify the resource function $R_{p_m}(e_h, e_t)$, which measures the resource flowing from $e_h$ to $e_t$ as the path reliability, in (10). When the middle entity $e_{i-1} \in W_c$, the resource allocated to the next entity is weakened by the relevance weight, resulting in a lower loss function in learning.
Here A is defined as in (8). Therefore, for each request, the blacklists are expanded with the reasoned middle entities, and a response is blocked when these middle entities are detected in the packet content.
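The blacklist expansion and blocking rule can be sketched as below. This is a simplified illustration under stated assumptions: a reasoned path is represented as a tuple of entity names from the requested entity to a blacklisted entity, and the function names and sample entities are hypothetical rather than taken from the paper.

```python
def expand_blacklist(blacklist, reasoned_paths):
    """Add the middle entities of each reasoned path to the blacklist.

    A path is (head, middle_1, ..., tail); only paths that end in an
    already blacklisted entity contribute their middle entities.
    """
    expanded = set(blacklist)
    for path in reasoned_paths:
        if len(path) > 2 and path[-1] in blacklist:
            expanded.update(path[1:-1])
    return expanded


def should_block(response_entities, expanded_blacklist):
    """Block a response that carries any blacklisted or reasoned middle entity."""
    return any(e in expanded_blacklist for e in response_entities)


blacklist = {"malware_site"}
paths = [
    ("free_movie", "mirror_host", "malware_site"),  # reaches the blacklist
    ("weather", "forecast", "city_map"),            # does not reach the blacklist
]
expanded = expand_blacklist(blacklist, paths)
blocked = should_block({"mirror_host", "trailer"}, expanded)
allowed = not should_block({"forecast"}, expanded)
```

Keeping the expansion per request means that the enlarged blacklist reflects only the context of that request and does not grow without bound.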

Simulation and analysis
In this section, we simulate the proposed fog computing based information-centric IoV firewall system and the semantic reasoning based threat awareness with ndnSIM (version 2.2), an NS-3 based NDN simulator. NS-3 (Network Simulator version 3) is a discrete-event network simulator released under the GNU GPLv2 licence [40]. Simulations of cache hits, defence accuracy and end-to-end communication delay are conducted. The results show that the proposed fog firewall performs well in terms of scalability, defence accuracy and efficiency.

Scalability evaluation of threat-aware fog-firewall
In the information-centric IoV, vehicles and pedestrians join and leave the vehicle network at arbitrary times. Thus, to verify the scalability of the proposed fog computing based information-centric IoV firewall system as well as the semantic reasoning based threat awareness scheme, the cache hits of ICN nodes are evaluated at different network scales. We choose cache hits as the evaluation dimension because, in information-centric IoV, ICN nodes store forwarded content in their caches. As the network scale grows, requests and response content increase as well. Thus, in the ideal case where no content is removed from the network, the average number of cache hits rises as requests increase, because more and more content has been requested before and is stored in the cache. In our proposed fog-firewall equipped information-centric IoV, however, firewalls mine sensitive threats and block them, so content stored in the cache cannot match later requests as it does in the firewall-free IoV. We therefore evaluate the average number of cache hits over all routers in both a fog-firewall information-centric IoV and a firewall-free information-centric IoV. Fig. 6 shows the validity and scalability of the proposed dynamically configured fog firewall. At the beginning of the simulation, there are 40 pedestrians and vehicles in total in the network. Each of them requests content with random names and different sizes, and responds as needed. The number of terminals in the simulation increases periodically from 40 to 400, while the packet transmission rate is kept the same. As the number of network nodes rises, the cache hits of the proposed firewall-equipped information-centric IoV are lower and grow more slowly than those of the firewall-free information-centric IoV.
This is because anomalous requests and illegal or unexpected data are blocked at the firewall, so they cannot enter the information-centric IoV or be cached in routers. The results indicate that the proposed fog firewall model and semantic reasoning algorithm can keep sensitive threat content out of the network. Besides, the model performs well as the network scale varies.

Defence accuracy evaluation of threat-aware fog-firewall
It is important for the firewall to perceive, mine and then defend against both explicit and inherent threat information. To evaluate the defence accuracy of our model, we define a variable called the unknown threat rate (UTR):

$$\mathrm{UTR} = \frac{E_h}{E_h + E_r + E_s} \qquad (12)$$

Here $E_h$ and $E_t$ denote entities of the knowledge graph. We choose WN18, a dataset derived from WordNet, as the knowledge graph. We randomly select 50 triples in WN18, each including a head entity $E_h$ and a tail entity $E_t$, and all the relation attributes connecting the two entities are mined using our semantic reasoning algorithm. The tail entity collection $E_t$ is listed in blacklists as known threats, and the heads $E_h$ act as implicit threats that should be mined and blocked by reasoning. In addition, we use Word2Vec trained on Wikicorpus to obtain synonyms or similar entities $E_s$ of the heads $E_h$; these act as correlated context to guide the reasoning from $E_h$ to $E_t$. The defence accuracy of the proposed firewall with semantic reasoning is then simulated. We evaluate defence accuracy with UTRs of 30, 50 and 70%. Nodes in the simulated information-centric IoV randomly request content from the three collections mentioned above. Simulations of each network topology are conducted 100 times, and the average accuracy is calculated. As shown in Fig. 7, firewalls with semantic reasoning always achieve higher defence accuracy than those whose blacklists are fixed to $E_t$ only, as the information-centric IoV expands. The accuracy rises because, as more correlated context is sent to firewalls, more entities in $E_h$ and $E_s$ are found to be relevant to $E_t$ and are therefore blocked, which shows that the proposed semantic reasoning performs well in perceiving threats.
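As a small worked example of (12), the Python sketch below computes the UTR from collection sizes. Interpreting $E_h$, $E_r$ and $E_s$ in (12) as the sizes of the corresponding collections is an assumption, and the counts used are hypothetical values chosen only to reproduce one of the evaluated rates; they are not simulation data.

```python
def unknown_threat_rate(n_h, n_r, n_s):
    """UTR = E_h / (E_h + E_r + E_s), with each argument a collection size."""
    total = n_h + n_r + n_s
    if total == 0:
        raise ValueError("at least one collection must be non-empty")
    return n_h / total


# Hypothetical sizes: 30 implicit-threat entities out of 100 requested entities.
utr = unknown_threat_rate(n_h=30, n_r=50, n_s=20)
```

With these counts the function returns 0.3, corresponding to the 30% setting; changing the share of $E_h$ gives the 50% and 70% settings in the same way.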

Efficiency evaluation of threat-aware fog-firewall
When the firewall conducts analysis and blocking, the network transmission delay is affected. This is because a fog firewall needs to cover more than one terminal, and when the traffic becomes very heavy, content has to wait in a queue to be filtered. Therefore, we evaluate the processing efficiency of the defence isolation provided by the fog-firewall model.
We choose end-to-end communication delay to evaluate the influence of the firewall on information-centric IoV communication. The end-to-end communication delay is measured from the moment the requester sends the packet to the moment the responder receives it, and includes the transmission delay between routing nodes and the processing delay of both firewalls and routing nodes. We simulate two scenarios, in which firewalls are deployed on fog nodes and directly on terminals, respectively. For both scenarios, the networks are designed with the same topology, requests and nodes. Therefore, the influence of the processing delay on routers can be neglected because the routing time is the same in both scenarios. By changing the frequency at which terminals request content, we obtain different delays. The average communication delay is recorded, and a comparison is shown in Fig. 8. As the interest request rate rises, fog-based firewalls show higher processing efficiency than firewalls on terminals. The average delay with fog firewalls increases more slowly and is almost always lower than that with terminal-deployed firewalls. This is because, although queuing may occur in a fog firewall, fog nodes provide crucial computation and awareness resources that help firewalls perceive and predict user behaviour, reason defence policies and conduct blocking, whereas general terminals cannot support lightweight but accurate defence, especially when the network traffic becomes heavy. In addition, firewalls deployed on terminals cannot make use of threats that have already been found by nearby nodes, and so waste time repeating the analysis.
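The delay decomposition used in this comparison can be written out as a short Python sketch. The function name and all delay values (in milliseconds) are hypothetical and serve only to show why the router term cancels when the two deployments share the same topology; they are not measurements from the simulation.

```python
def end_to_end_delay(link_delays, router_delays, firewall_delays):
    """Sum of per-hop transmission delays and processing delays at routers and firewalls."""
    return sum(link_delays) + sum(router_delays) + sum(firewall_delays)


# Same topology in both scenarios, so link and router delays are identical.
links_ms = [4, 6, 5]
routers_ms = [1, 1]

fog_delay = end_to_end_delay(links_ms, routers_ms, firewall_delays=[3])
terminal_delay = end_to_end_delay(links_ms, routers_ms, firewall_delays=[7])

# The difference depends only on the firewall processing delay.
difference = terminal_delay - fog_delay
```

Since the link and router terms are identical in both calls, any gap between the two totals is attributable to firewall processing and queuing alone, which is the quantity the comparison in Fig. 8 is meant to isolate.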

Conclusion
To construct an information-centric IoV firewall system that defends against potential threats from content with semantic ambiguity, we proposed a threat-aware fog-firewall mechanism supporting content-oriented semantic reasoning. It can prevent illegal content from entering the information-centric IoV and unexpected access between different private networks. An interest semantic reasoning approach with context was proposed to mine potential relations and threatening knowledge from interest names and contents. The simulation results showed that the proposed fog-based information-centric IoV firewall system could provide scalable, accurate and efficient isolation defence. This work is significant for improving the security of information-centric IoV.

Acknowledgments
This work was supported by the National Natural Science Foundation of China (grant nos. 61431008 and 61571300).