Efficient key management scheme for health blockchain

: Healthcare is a big application scenario of blockchain, and blockchains used in healthcare are called health blockchain. In general, blockchain blocks are open and the transactions in them are public. If some privacy data are involved in these transactions, they will be leaked. Owing to healthcare system involving a great deal of privacy data, certain security mechanisms must be built to protect these privacy data in health blockchain. Furthermore, because the core of security mechanisms is the key management schemes, the appropriate key management schemes should be designed before blockchains can be used in healthcare system. Here, according to the features of health blockchain, the authors use a body sensor network to design a lightweight backup and efficient recovery scheme for keys of health blockchain. The authors ’ analyses show that the scheme has high security and performance, and it can be used to protect privacy messages on health blockchain effectively and to promote the application of health blockchain.


Introduction
Blockchain was first introduced in bitcoin and is the supporting technology of bitcoin [1]. It uses technologies such as consensus mechanism [2], digital signature, and hash chains to record bitcoins' transactions by building a distributed, shared database in a decentralised manner. These technologies provide security services such as non-repudiation, integrity, distributed storage, time-based traceability for transaction contents, which form a robustness system and make bitcoins circulate cross the Internet freely to realise the value migration in untrusted networks. Later, people gradually realise that blockchain can be used in various fields such as healthcare, fintech, computational law, audit, notarisation, and so on by designing various smart contracts based on blockchain. The use of blockchain can greatly improve the efficiency and the security of transactions' processing, and reduce the cost [3][4][5]. At present, it is widely believed that consensus in untrusted networks, robustness, and value migration in a decentralisation manner are the main features of blockchain. Based on these features, we can predict that blockchain will update the current information-Internet to value-Internet in the future, and thus dramatically change the mode of our society and our life. Fig. 1 shows the architecture of blockchain.
However, as a developing technology, blockchain still is facing some problems. For example, how to improve the speed of transaction recording? How to improve the efficiency of consensus? How to protect privacy data on blockchains? All of these problems have effects on the popularisation and application of blockchain. Among these problems, the third one is the most highlighted one, as many applications of blockchain concern about privacy data and people are becoming more and more concerned about privacy issues.
The core of the third problem is how to build a feasible key management scheme for blockchain. Blocks on blockchain are public and shared by all participants. When these blocks are involved with privacy data, it is necessary to encrypt these data to protect privacy information. However, the key management scheme related to the privacy protection is hard to design. One key for all blocks is unfeasible, the encrypted blocks will be vulnerable under the statistical attack. Yet, one key for one block is unfeasible, as it will require a high cost for storing and recovering a tremendous number of historical keys. At the same time, designing key management scheme for blockchain has to consider the application scenarios, for different scenarios have different features and it is hard to design a generic key management scheme for all scenarios.
At present, it is a consensus that blockchain has great potential application values in the field of healthcare. However, because blocks on health blockchain are involved in a great number of private health data, it is necessary to solve the problem of privacy protection before the popularisation of health blockchain.
With the development of electronic techniques, body sensor networks (BSNs) emerge to survey the health of the human. Biosensor nodes in BSNs can be deployed on/into the human body to collect physiological signals and send these signals to a remote hospital for further processing. To protect physiological signals from the human body, many researches have been done in designing the key management for BSNs. In the paper, we merge BSNs and health blockchain together, and make use of the idea of designing the key management scheme for BSNs to design a lightweight backup and efficient recovery scheme for keys of health blockchain.
The rest of the paper is organised as follows. Section 2 presents the existing research results related to key management schemes for blockchains. Section 3 proposes a lightweight backup and efficient recovery scheme for keys of health blockchain. The performance and security analyses are given in Section 4. In Section 5, conclusions are drawn.

Related work
As a supporting technology for bitcoin, the blockchain is known by the public with the popularisation of bitcoin. Later, people find that the blockchain has broad application space in the field of healthcare, fintech, law, energy, and so on by designing various smart contracts. As most application scenarios involve the storage of privacy data, before applying blockchain on these fields, the blockchain must solve the problem of privacy protection.
To address the problem, research in [6] proposes a scheme using blockchain to protect personal data, and the scheme ensures users own and control their data. However, the scheme focuses on the construction of blockchain and how to authenticate the access to blockchain, but does not give a concrete solution to design the related key management scheme. A study in [7] proposes a method that uses bitcoins and the blockchain to solve subjective trust and quantification of trust in pretty good privacy (PGP) mechanism, and gives a solution to store and use the PGP certificates. However, the research does not concern the key management related to blockchain encryption. A study in [8] proposes a method using blockchain to authenticate the decentralised sensor data. The scheme makes use of timestamp, hash function, and the mechanism of work proof to check the validity of sensor data, and does not consider the confidentiality of sensor data, so the scheme does not concern the research of key management. Wang et al. [ 9] propose a data security sharing network architecture based on block chain to support exchange information for internal and intel-enterprise. However, the work also ignores the confidentiality of data and does not concern the key management on blockchain. Zhu et al. [ 10] point out using asymmetric encryption and multi-blind signature to realise privacy protection, while the two technologies are not available to protect a great deal of data because of their low efficiency.
It can be seen that at present, the blockchain is a developing technology, and the security research on it is only in an initial stage with few work being done on this field.
3 Lightweight backup and efficient recovery scheme for keys of the health blockchain

Application scenario of health blockchain
In this section, we first present the general application scenario of health blockchain.
Recently, the emergence of BSNs greatly promotes the development of smart healthcare industry. A BSN is composed of tens of biosensor nodes that are deployed on or into the human body [11]. These nodes are equipped with various biosensors that can collect the physiological signals such as blood pressure (systolic and diastolic), electrocardiogram, blood oxygen level (SpO2), photoplethysmogram (PPG) signals, and so on. In addition, they also are equipped with wireless network chips, and these chips not only help biosensor nodes form a BSN, but also help these nodes sending collected physiological signals to a special relay node (generally called PDA) that takes charge of merging and forwarding signals to a remote medical centre such as a hospital [12]. Thus, with the help of BSNs, a person can easily build his/her health file and a hospital can easily acquire the comprehensive health condition of a patient before treatment. It can be seen that BSNs can significantly improve the current medical environment and open the door for the smart healthcare time.
In the traditional application scenario of BSNs, a user's physiological signals generally only are sent to one target hospital. The hospital stores these data, and when the user needs to use these data for health purpose, or the doctor related to the user needs to use these data for medical purpose, the hospital will draw and analyse these data, and send the analysis results to the user or the doctor. However, the scenario has some problems: (i) monopoly problem: concentrating users' physiological signals in one target hospital will cause the monopoly of medical data. When the user goes to other hospitals for treatment, the hospital storing the user's physiological data generally is reluctant to share these data to other hospitals for the sake of interests. (ii) Vulnerability problem: storing physiological data in one hospital has vulnerability, and an accident will cause the loss of a user's physiological data. (iii) Privacy problem: the target hospital may deliver users' physiological data to insurance companies, medical companies, and so on for commercial purposes without user knowledge, which will violate users' privacy. (iv) Integrity problem: when a medical dispute occurs, a hospital may tamper with the medical data stored in the storage device owned by itself, so that the hospital will be in a favourable position in the dispute.
When we use blockchain in the healthcare system, and merge blockchain and BSNs together, the above problems will be solved thoroughly. Fig. 2 shows a smart healthcare system with blockchains and a BSN.
In Fig. 2, a smart healthcare system consists of a BSN and two health blockchains, A-chain and B-chain. The BSN is deployed on a user's body and is composed of a few wearable nodes. One of them is the gateway device and some are implanted nodes. Wearable nodes and implanted nodes are used to measure the user's various physiological signals and send these signals to the gateway node. The gateway device takes charge of converging these physiological signals and broadcast the related physiological data to some target hospitals. These target hospitals form a healthcare alliance and each of them provides a blockchain node (a computer server). All blockchain nodes make uses of consensus mechanism, digital signature, and hash chain technology to maintain the two health blockchains. When these blockchain nodes receive a broadcast physiological message from the gateway device, they use consensus mechanism to check its validity. Also, once the message is checked valid, blockchain nodes will put it on the A-chain by signing it. Otherwise, if the message is checked invalid or some nodes vote against the message, blockchain nodes will reject the message according to their consensus mechanism, and put the messages and the cause on the B-chain.
In the scenario, each blockchain node stores a duplicate of each blockchain. When the user goes to any hospital in the health alliance, the visited hospital can draw the user's physiological data  from A-chain on its blockchain node. It solves the monopoly problem of users' physiological data. In addition, it solves the vulnerability and integrity problems of physiological data storage by multi-nodes' backup and nodes signatures.
To solve the problem of privacy, we adopt the idea of key management for BSNs to improve the broadcast process in the smart healthcare system as follows: before a biosensor node sends the target physiological signals to the gateway device, the node first produces a key using other physiological signals it measures, and uses the key to encrypt the target signals. Next, the node sends the encrypted signals to the health blockchain with the help of the gateway device. As the blockchain nodes and the corresponding hospitals do not know the encrypting key, they cannot leak users' private physiological data to other organisations. When a user wants to recover his/her physiological data from the health blockchain, the user can ask his/her biosensor node to recover the encrypting key and then use the key to restore his/her physiological data.
It could be seen that in the scheme, the health blockchain only stores the cipher text of physiological data and the power of decrypting these data is controlled by the user. In other words, the user controls who can access his/her physiological data. It will solve the privacy problem.
In the realisation, we use fuzzy vault technology to carry out the generation, backup, and recovery of keys of health blockchain. Also, to explain the scheme clearly, we first give the detail of fuzzy vault in Section 3.2.

Fuzzy vault
Fuzzy vault is a cryptographic primitive, and it can use a set A to build a structure denoted by vault to hide a secret S. S could be unhidden if another set B is similar enough to the set A. Based on fuzzy vault, research in [13] proposed a key management scheme called Photoplethysmogram based key agreement (PKA) that uses PPG signals to negotiate a common key between two biosensor nodes. The scheme including five steps: (i) Production of PPG vector. Under a loose synchronisation mechanism, biosensor nodes A and B on the same human body collect PPG signals, and then both of them use fast Fourier transform (FFT) to encode these signals into vectors: .., f a s l and F r = kf 1 r , f 2 r , ..., f a r l.
(ii) Creating polynomial. Biosensor node A creates a polynomial p(x) with a public order a. Also, the coefficients are produced from a random number and are used to be encoded into a common key. For instance, if the coefficients are e a , e a−1 ,…, e 1 , e 0 , the common key will be K = e a ||e a−1 ||,…,| | e 1 ||e 0 . (v) Opening vault. When B receives the vault R, it draws a points set U from R, where the x ordinates of points in U are elements in Fr. Next, B tries to reproduce the polynomial p based on points in U by Lagrangian Interpolation. If B could produce a polynomial p ′ , it will use the coefficients of p ′ to produce a key K ′ as mentioned in the first step. Finally, B use K ′ to check validity of the MAC T(K, R) it receives. If the MAC is valid, it means that A and B share the common key K successfully, otherwise A and B will restart the key negotiation process.
A later study [14] found that PKA left some practical problems unsolved. For instance: (i) PKA requires that biosensor nodes A and B share at least v + 1 feature points to reproduce a vth-order polynomial, but it is not a goal easy to reach. (ii) Some important parameters are inversely correlated. Namely, when the length of the common key is determined, the order v of the produced polynomial p(x) and the average length of each coefficient of the polynomial e are inverse correlated. While from the perspective of security, both v and e are all required to be large enough to resist brute-forcing attack.
To solve these problems, a study in [14] proposed an improved scheme for PKA: when biosensor nodes A and B need to negotiate a common key, A first generates a key material and encodes it into RS (Reed-Solomon) codewords using RS code. Also, then these codewords are encoded as the coefficients of the chosen polynomial p(x) with the order v. Next, A collects physiological signals and uses FFT to encode these signals into a feature vector. Finally, A inputs the feature vector into the polynomial p(x), and uses the produced points on p(x) and a chaff points set to build a vault. To B, when it receives a vault from A, it uses a reconstruction method called lower-order twice reconstruction (LOTR) to reproduce p(x). In other words, in the beginning, B maybe cannot find enough matched physiological signals with A to construct a v-order polynomial. In this condition, B can use a small number of matched physiological signals to build a lower-order polynomial. Next, B estimates the left points in p(x) according to the lower-order polynomial. Finally, B recovers p(x) using the matched points and the estimated points. After LOTR process, if B obtains the coefficients of p(x), it can calculate the key material using RS code.
In the improved scheme, A and B do not need as many matched points as the research [13] to reconstruct p(x), which solves the first problem of PKA scheme. Besides, because RS code is used in q-ary field, when we use RS code as the coefficients of p(x), the bit length of each coefficient of p(x)i safixed length q, which breaks the inverse correlation between the length of coefficient and the order of p(x), and solves the second problem to some extent.
Since the improved PKA scheme is superior to the original one in terms of security, in Section 3.3, we use the improved PKA scheme to design the keys' generation, backup, and recovery scheme for health blockchain.

Keys' generation and lightweight backup scheme for health blockchain
In the scheme, key generation process is designed as follows: (i) In the initialisation period of a BSN, some biosensor nodes measuring PPG signals are appointed to generate encrypting keys for health blockchain. Here, we suppose that biosensor node A works as the role.
(ii) When the gateway device needs to encrypt a physiological data, it asks A to produce a key for health blockchain. Once A receives the order from the gateway device, it first generates a pre-key K = k a ||k a−1 ||,…,||k 1 ||k 0 , and then uses RS code to encode k i (0 ≤ i ≤ a) into codewords e i with the length of q. Finally, A uses e i as coefficients to construct a polynomial p(x) with the order a. (iii) A communicates with adjacent biosensor nodes measuring PPG signals to find a group of the same PPG signals to get a stable signals set. Also, then A encodes these signals into a vector F s using FFT. Next, A puts F s into p(x) to calculate some points on p(x). These points form a set D, and in order to protect D, A generates a chaos set C, and then mixes D and C to form the set R=D∪C as the vault. (iv) A chooses a pseudo-random function F(·) and calculates K*=F (k*,K) as the encrypting key for the health blockchain. Here, k*i s the pre-distributed key in all of biosensor nodes. At the same time, A generates a random number r, and uses r and k* to hide the vault R: M = E(r⊕k*,R), where E() is a symmetric encryption algorithm, ⊕ the XOR operation.
(v) A sends K*||M||r||H(K*)||ID A to the gateway device G in a secure manner using the security association between A and G.
Here, H() is a hash function, symbol '||' denotes concatenation operation; ID A is the identity of A.
(vi) G uses K* to encrypt the physiological data Mp, and broadcasts the encrypted Mp and M||r||H(K*)||ID A ||B A among blockchain nodes to let them put these messages on the health blockchain. Here B A denotes the index of the block and includes the information such as which biosensor node generating K*, the time of generating the block, what kind of physiological data being in the block, and so on. Finally, for the sake of security, G will delete K*.
In addition, there should be authentication mechanisms among biosensor nodes, gateway devices, and blockchain nodes to realise entities authentication and message authentication in steps v and vi. Similar researches were proposed in [15][16][17], while authentication mechanisms in them are not available to the scenario in the paper. So, in subpart E, we give our design. Fig. 3 shows the encrypted block on the health blockchain.

Efficient recovering keys of health blockchain
When the user needs to decrypt his/her physiological data and authorises a target hospital or a target doctor to use them for treatment, he/she can execute the following process.
(i) The user uses his/her gateway to point out which encrypted block on the health blockchain will be decrypted. Also, then G searches the corresponding encrypted block on the health blockchain by the index B A . When it finds the block, it will send the related M||r||H(K*) to the biosensor node A.
(ii) Once A receives M||r||H(K*), it uses r and pre-distributed k 0 to decrypt M to get the set R. Also then, A communicates with the adjacent biosensor nodes measuring PPG signals to obtain a stable set of PPG signals. Next, A uses FFT to encode the set of PPG signals into the vector F r and then draws the points set U from R, where the x ordinates of points in U are the elements in F r . (iii) If the set U has enough points, node A can directly use Lagrange's interpolation to build a polynomial with the order a. While in most cases, due to physiological noises, the elements in U are not enough to build a polynomial with the order a, and in this condition, A can adopt the LOTR method, that is to say, A first constructs a lower-order polynomial pl(x), and next, A estimates the left points in p(x) according to the polynomial pl(x). Finally, A recovers p(x) using the matched points and the estimated points.
(iv) To verify the validity of the recovered polynomial, A first decodes its coefficients by RS code and uses the decoded results to form a key K* ′ . Also then, A checks whether H(K* ′ )=H(K*). If they are the same value, it means that p(x) is recovered successfully, and otherwise A repeats the recovery process.
After A recovers the key K*, it sends K* to G by the security association between A and G. Also then, G will decrypt the physiological data by K*, and authorise other entities visiting them by other security mechanisms. Finally, G deletes K* for the sake of security.

Authentication mechanisms
In the scenario, blockchain nodes cannot identify the authenticity of physiological data delivered by gateway devices. If these data come from illegal gateway devices or illegal biosensor nodes, it will disturb the normal operation of the whole system. Based on the consideration, the whole system requires gateway devices authenticating biosensor nodes, and blockchain nodes authenticating gateway devices. According to the requirement, we propose the following authentication mechanisms.
We implant the authentication mechanism in steps v and vi of subpart C if node A is an adversary and operates as the description in steps i-iv. It will exposure its identity under the authentication mechanism in step v, which means it consumes its energy unnecessarily in steps i-iv.
In step v, the gateway device G needs to ensure the received messages coming from a legal biosensor node. Thus, biosensor node A will change the messages in step v into the following form: After G decrypts K* using the pre-distributed key k*and r, and verifies K* using H(K*), it can ensure that M A comes from a legal biosensor node because only legal biosensor nodes have the pre-distributed key k*.
In step vi, G should use digital signature algorithm to sign messages using its private key before broadcasting these messages. Thus, blockchain nodes will know which gateway device broadcasts. Suppose, Mp denotes the physiological signals and Mg = M||r||H(K*)||ID A ||B A , and then the broadcasted messages have the following form: Here, Sig(.) denotes the signature algorithm and Kpg the private key of gateway device.
After each blockchain node receives the message, it can verify the message by using the public key of G. According to the consensus mechanism, if blockchain nodes believe G is a legal device and the signature is valid, they will put E(K*,Mp)||Mg on the A-chain.
If G is illegal device or the message's signature send by G is invalid, blockchain nodes will put Sig(Kpg,H(E(K*,Mp)||Mg))|| Mw||MSig(Sig(Kpg,H(E(K*,Mp)||Mg))||Mw) on the B-chain to record the accident. Here, Msig(.) denotes a broadcasting multidigital signature scheme executed by blockchain nodes [18]; Mw records the broadcasting time of physiological messages and the reason of rejection. For the sake of saving storage, B-chain need not to store (E(K*,Mp)||Mg). When G asks question about the rejection, it can inquire B-chain by clues: Sig(Kpg,H(E(K*,Mp)||Mg)) and the messages' broadcasting time.
The reason that biosensor nodes do not use digital signature mechanism is that signature algorithm is a high-energy operation, and energy in biosensor nodes is little.

Authentication mechanisms
In the smart healthcare system, K* and the private physiological data are the objects being protected and they also are the objects the adversary wants to attack. In the following, we analyse the security of the healthcare system by attack method of the adversary.
Generally, the adversary has two possible attack ways to obtain K* and the private physiological data. One is attacking the health blockchain, and the other is attacking the BSN.

Attacking the health blockchain:
If the adversary wants to launch an attack to a special physiological data or an encrypting key from the health blockchain, he will first draw the encrypted block according to ID A and B A from the health blockchain.
However as shown in Fig. 3, the only information about K* the adversary can get from the encrypted block is H (K*). Owing to the one-way feature of hash function, the adversary cannot recover K* from H (K*). Maybe the adversary wants to get some information about K* from M. However, M is an encrypted result of a symmetric encryption algorithm, and when we use AES or 3DES as the algorithm instance, the adversary hardly has chance to decrypt M. Though the adversary is fortunate enough to get the vault R from M by some method, he still does not know which true points take part in the generation of K*, for R is a mixed set of true points set D and a chaos set C.
As the private physiological data are protected by K*, in the condition that the adversary does not K*, he cannot obtain the private physiological data yet.

Attacking the BSN:
Since K* is produced by the BSN, the other way that the adversary can obtain K* or the private physiological data is attacking the BSN.
However, the biosensor nodes in BSN are deployed on or into the human body and under the surveillance of the user all the time, the adversary has little chance to touch these biosensor nodes and draw physiological signals from them.
In the worst case, maybe a user's BSN has more wearable biosensor nodes that can measure PPG signals, and the adversary can touch some of these nodes in some condition. In this case, we can increase the order of the polynomial used to produce the pre-key. Since we use the improved PKA scheme to produce pre-key, when we increase the order, the length of the polynomial's coefficients will not be reduced, which will increase the security of the smart healthcare system.

Performance mechanisms
The main advantage of the proposed scheme is the storage of keys. In general methods, in order to resist the statistical attack, the health blockchain has to change the encryption keys frequently, and it will cause the generation of a great amount of historic keys. These historic keys must be well stored and indexed; thus, when a user wants to decrypt a block, the healthcare system could find the corresponding keys quickly. In the case, the storage cost will be great.
In our scheme, the health blockchain does not need to store an encrypting key but a clue to a key. The recovery of the key is executed by the BSN. It will greatly reduce the storage cost.
In addition, the clue of encrypting keys is with the encrypted block, so the healthcare system does not need to search the related keys. It will improve the efficiency of decrypting block.

Conclusion
The health blockchain is a good solution to address the problem of monopoly of physiological data and improve the robustness of storing these data, and has a broad application prospect in the area of healthcare system. However, before the popularisation of the health blockchain, we must address the problem of protecting private physiological data. The core problem is designing an effective key management scheme.
In the paper, we merge the BSN and the health blockchain, and use the biosensor nodes in the BSN to propose a lightweight backup and efficient recovery scheme for keys of health blockchain. The scheme has the following advantages: (i) biosensor nodes in the BSN are in charge of generation, backup, and recovery of the keys of health blockchain, and it will increase the security of these keys. (ii) In the scheme, each block on the blockchain can be encrypted by a distinguished key with lower storage cost and high performance, and it will greatly improve the security of privacy physiological data on the health blockchain.