We propose a comprehensive research framework for evaluating the GIE of China’s high-tech industry and analyzing its spatial characteristics (Fig. 1). Beginning with a definition of GIE in the high-tech industry, this study selects input-output indicators and technological environmental variables and employs the three-stage undesirable SBM model for calculation. The Theil index is used to analyze the spatial differentiation characteristics of GIE, while the Moran index and Standard Deviation Ellipse are used to analyze the spatial clustering features. Additionally, the Spatial Markov Chain and β-convergence model are applied to analyze the spatial convergence characteristics.
Variable selection
The selection of variables for measuring GIE in the high-tech industry requires a robust theoretical foundation. Input variables must capture essential resources driving innovation and the green transition. Key input dimensions include research institutions, personnel, funding, and transformation resources. Output variables should reflect multiple innovation outcomes, including technological achievements, economic benefits, and environmental consequences. This captures both desirable and undesirable results of innovation2,7. The technological environment significantly shapes innovation effectiveness5,6. Variables representing technological importation, learning, transformation, and support are therefore critical. These capture external and internal mechanisms enabling an industry’s innovation capacity and shift toward sustainable practices, summarized in Table 1.
Input variable
The count of research and development institutions measures the number of research institutions. These institutions are fundamental to technological and scientific progress22. This count quantifies the availability of specialized resources, knowledge infrastructure, and potential collaboration opportunities within the high-tech sector. R&D institutions serve as central hubs for cutting-edge research, advancing green technologies through partnerships across academia, government, and industry. Their regional or sectoral density directly indicates innovation capacity, particularly in domains critical for green transitions like energy efficiency and pollution control. Consequently, the R&D institution count is vital for assessing resources committed to green innovation.
The full-time equivalent of R&D personnel measures research personnel input. Human capital critically drives technological innovation23. This metric quantifies the workforce dedicated to research, reflecting human resource intensity in innovation. Skilled personnel are indispensable for translating concepts into viable green technologies. Quantifying the R&D workforce signifies available labor capacity to propel innovation and directly builds the knowledge base essential for high GIE. Greater R&D personnel concentration increases the potential for significant green technological breakthroughs.
R&D expenditure represents research funding. Financial resources allocated to R&D are paramount for sustaining innovation24. This expenditure quantifies financial support accessible to the high-tech industry for research, testing ideas, and developing green technologies. Investment levels signal public and private sector commitment to innovation, particularly in environmentally sustainable technologies. Higher R&D expenditure typically correlates with enhanced capacity for breakthrough green innovations. This funding underpins creating energy-efficient products, environmentally friendly production methods, and sustainable solutions, substantially influencing overall GIE.
Expenditure on new product development measures transformation funding. This funding, directly linked to commercializing innovations, represents resources directed toward converting research outputs into market-ready, environmentally sustainable products25. The high-tech industry requires significant investment not only in research but also in scaling and commercializing green innovations. This expenditure ensures research breakthroughs become sustainable solutions adopted by industries and consumers. It reflects the critical capacity to translate theoretical knowledge into tangible applications, necessary for achieving high GIE.
Output variable
The number of effective invention patents measures technological achievement. This achievement is a primary indicator of innovation success and impact in the high-tech sector26. The count of effective invention patents represents technological progress, as patents attest to the novelty and originality of inventions. The volume of in-force patents directly connects to developing new green technologies that enhance industrial sustainability. Patents provide formal recognition of innovation; their quantity reflects creative output and the market significance of environmentally sustainable technologies. A higher number of effective invention patents signifies substantial technological advancement, a core green innovation output indicating successful conversion of research into valuable assets.
Sales revenue from new products measures economic benefit. This revenue quantifies the economic impact of green innovations and directly reflects market acceptance and commercial viability27. New product launches, especially environmentally sustainable ones, are major economic outputs in high-tech industries. They signal adoption by industries and broader markets. Sales revenue indicates economic growth and demonstrates proficiency in commercializing innovations. This variable is essential for evaluating financial returns on R&D investments and highlights how efficiently innovations yield sustainable economic benefits. Linking revenue to green products precisely assesses their contribution to profitability and growth.
An environmental pollution index, a weighted aggregate of emissions of carbon dioxide, chemical oxygen demand, ammonia nitrogen, sulfur dioxide, nitrogen oxides, and particulate matter, quantifies environmental pollution9. Pollution is an undesirable output, as high-tech industries generate negative environmental externalities despite progress. This index comprehensively measures industrial environmental impact through aggregated emissions. Including it explicitly acknowledges the trade-off between innovation and environmental degradation when balancing technological development with ecological sustainability. Industrial processes may still produce harmful emissions even as green innovations target pollution reduction. Measuring pollution is therefore essential for evaluating detrimental industrial byproducts and determining whether innovations genuinely advance environmental sustainability. Reducing pollution while enhancing technological and economic outputs defines the core challenge in assessing GIE.
Technological environment variable
Expenditure on technology introduction measures technological importation. This importation means spending to acquire foreign technologies, substantially augmenting national or industrial innovation capacity28. Such expenditure is vital for accessing advanced technologies, particularly in environmental sustainability. Importing foreign technologies enables industries to bypass traditional development and adopt more efficient, environmentally friendly solutions more quickly, thereby accelerating green innovation. Investment helps firms overcome knowledge deficits, adopt international best practices, and shorten internal R&D timeframes, leading to faster gains in GIE. Expenditure on technology introduction is therefore a key environmental factor for green innovation inputs.
Expenditure on digestion and absorption measures technological learning. This learning means how firms internalize, adapt, and enhance externally acquired technologies29. Such expenditure critically assesses how effectively high-tech industries adapt imported technologies to their contexts and innovate upon them. While importation provides immediate access, effective assimilation and modification are paramount for long-term success with green technologies. This spending reflects investments in human capital, training, and processes that maximize the absorption of foreign technologies. Strengthening learning capabilities improves firms’ ability to innovate and implement sustainable practices, making this variable essential for GIE.
Expenditure on technological transformation measures this process. Technological transformation converts new technologies or ideas into practical, scalable solutions30. This expenditure signifies resources allocated to turning research outputs, imported technologies, or novel ideas into operational, market-ready products. It ensures green innovations move beyond theory into production systems, advancing environmental sustainability. Achieving transformation often requires substantial investment in infrastructure, equipment upgrades, and organizational restructuring for new, greener technologies. This spending signals the industry’s dedication to converting innovation into tangible outcomes, aligning with economic and environmental objectives. Technological transformation expenditure is thus a key environmental factor influencing green innovation inputs that drive high-tech industry efficiency.
The number of high-tech enterprises measures technological support. This count reflects the prevalence of firms capable of advanced research and innovative technology adoption31. High-tech enterprises possess the infrastructure, expertise, and market orientation to drive technological innovation, including green technology development and deployment. A significant concentration creates a knowledge-sharing ecosystem, fostering collaborative ventures and competitive dynamics that enable green innovation. These firms pioneer emerging technologies and invest more readily in sustainable solutions. Therefore, their number constitutes an important environmental factor influencing the scale and scope of green innovation inputs. A larger population enhances collective innovation capacity, improving GIE.
Data source
The relevant data, delineated in Table 1, are sourced from the China Statistical Yearbook on Science and Technology (CSYST, 2007–2023), the China Statistical Yearbook on Environment (CSYE, 2007–2023), and the CEADs database (CEAD, www.ceads.net), depending on the specific variable considered.
Study area
This study covers 30 provinces in China. Due to data availability constraints, the sample excludes Tibet, Taiwan, Hong Kong, and Macau. The provinces are grouped into eight economic regions, illustrated in Fig. 2. The Northeast Region comprises Liaoning, Jilin, and Heilongjiang. The Northern Coastal Region includes Beijing, Tianjin, Hebei, and Shandong. The Eastern Coastal Region comprises Shanghai, Jiangsu, and Zhejiang. The Southern Coastal Region includes Fujian, Guangdong, and Hainan. The Yellow River Middle Region includes Shaanxi, Shanxi, Henan, and Inner Mongolia. The Yangtze River Middle Region comprises Hubei, Hunan, Jiangxi, and Anhui. The Southwest Region includes Yunnan, Guizhou, Sichuan, Chongqing, and Guangxi. The Northwest Region includes Gansu, Qinghai, Ningxia, and Xinjiang; Tibet also belongs to this region but is excluded from the sample.

The GIE of the high-tech industry is measured using the three-stage undesirable SBM model. In the first stage, the undesirable SBM model measures the GIE of the high-tech industry without considering technological environmental factors and yields the slack variables of the input indicators. In the second stage, an analogous SFA model adjusts the slack variables of the input indicators for technological environmental factors. In the third stage, the undesirable SBM model is applied again to the adjusted input indicators, measuring the GIE of the high-tech industry while accounting for technological environmental factors. The key advantage of the three-stage undesirable SBM model is that it accounts for the impact of technological environmental factors on GIE, so the measured results more accurately reflect the technological innovation characteristics of the high-tech industry3,5.
The three-stage undesirable SBM model consists of three components: the traditional undesirable SBM stage, the analogous SFA stage, and the adjusted undesirable SBM stage32,33. GIEi represents the GIE of the high-tech industry in the i-th (i = 1,3) stage. The settings for each stage of the model are outlined below.
Define the production possibility set \(P(\mathbf{x},\mathbf{y}^{e},\mathbf{y}^{u})\):
$$P(\mathbf{x},\mathbf{y}^{e},\mathbf{y}^{u}) = \left\{ \begin{array}{ll} x_{ik} = \sum\nolimits_{j=1}^{n} \lambda_{j} x_{ij} + w_{i}^{-} & i = 1,\cdots,m \\ y_{sk}^{e} = \sum\nolimits_{j=1}^{n} \lambda_{j} y_{sj}^{e} - w_{s}^{e} & s = 1,\cdots,r_{1} \\ y_{qk}^{u} = \sum\nolimits_{j=1}^{n} \lambda_{j} y_{qj}^{u} + w_{q}^{u} & q = 1,\cdots,r_{2} \\ \lambda_{j} \ge 0 & j = 1,\cdots,n \end{array} \right\}$$
(1)
Where \(\lambda\) represents the weight vector, \(w^{-}\) represents the input slack variable, \(w^{e}\) signifies the desirable output slack variable, and \(w^{u}\) denotes the undesirable output slack variable. The formula for calculating GIE in the high-tech industry based on input orientation is as follows:
$$GIE^{1} = \frac{1}{\mathop{\min}\limits_{\boldsymbol{\theta}} \boldsymbol{\theta}\mathbf{x} \in P(\mathbf{x},\mathbf{y}^{e},\mathbf{y}^{u})}$$
(2)
An analogous SFA model mitigates the impact of environmental factors and random factors on input variables in the high-tech industry, where \(\mathbf{z}\) represents the set of environmental factor variables. Input variables are adjusted through their slack variables: in the regression, the environmental factors are the independent variables and the input slack variables are the dependent variables.
$$w_{i}^{-} = f_{i}(\mathbf{z}|\boldsymbol{\beta}_{i}) + v_{i} + u_{i}$$
(3)
In Eq. (3), the stochastic factor \(v_{i}\) and managerial inefficiency \(u_{i}\) are mutually independent, and each is independently and identically distributed, as \(N(0,\sigma_{v}^{2})\) and \(N(0,\sigma_{u}^{2})\), respectively. Let \(\sigma^{2} = \sigma_{v}^{2} + \sigma_{u}^{2}\) and \(\gamma = \sigma_{u}^{2}/\sigma^{2}\). A value of \(\gamma\) close to 1 indicates that managerial inefficiency predominantly influences input slack, whereas a value of \(\gamma\) close to 0 suggests that stochastic factors predominantly affect input slack. The adjusted input is then
$$x_{i}^{*} = x_{i} + \left[ \max f_{i}(\mathbf{z}|\boldsymbol{\beta}_{i}) - f_{i}(\mathbf{z}|\boldsymbol{\beta}_{i}) \right] + \left[ \max(v_{i}) - v_{i} \right]$$
(4)
By substituting the input vector \(\mathbf{x}\) with the adjusted vector \(\mathbf{x}^{*}\) in Eq. (2), we can calculate the adjusted GIE of the high-tech industry.
$$GIE^{3} = \frac{1}{\mathop{\min}\limits_{\boldsymbol{\theta}} \boldsymbol{\theta}\mathbf{x}^{*} \in P(\mathbf{x}^{*},\mathbf{y}^{e},\mathbf{y}^{u})}$$
(5)
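The input adjustment in Eq. (4) can be illustrated with a minimal sketch. It assumes the second-stage SFA regression has already been estimated, so that the fitted environmental effect \(f_{i}(\mathbf{z}|\boldsymbol{\beta}_{i})\) and the stochastic term \(v_{i}\) are available per province; the function name `adjust_inputs` and the toy numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def adjust_inputs(x, fitted_env, noise):
    """Apply Eq. (4) to a single input across all provinces.

    x          : observed input values, shape (n,)
    fitted_env : fitted environmental effect f_i(z | beta_i), shape (n,)
    noise      : estimated stochastic term v_i, shape (n,)
    """
    x = np.asarray(x, dtype=float)
    fitted_env = np.asarray(fitted_env, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Shift every province to the environment with the largest fitted slack.
    env_shift = fitted_env.max() - fitted_env
    # Shift every province to the largest stochastic term.
    noise_shift = noise.max() - noise
    return x + env_shift + noise_shift

# Toy data (hypothetical): three provinces, one input.
x_star = adjust_inputs([10.0, 20.0, 30.0], [1.0, 3.0, 2.0], [0.5, 0.0, -0.5])
print(x_star)
```

Both bracketed terms in Eq. (4) are non-negative, so adjusted inputs are never smaller than observed inputs; provinces that already face the largest environmental effect and stochastic term are left unchanged.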
Theil index
We use the Theil index to assess the disparity in GIE within China’s high-tech industry and analyze this variation across eight economic regions. The Theil index, a standard measure of inequality, is particularly suitable for evaluating GIE disparity because it decomposes total inequality into within-group and between-group components. This decomposition provides insights into whether intra-regional or inter-regional differences primarily drive the disparity18. The choice of the Theil index is based on its ability to handle varying inequality levels across subgroups, offering a comprehensive view of GIE distribution. It accounts for the extent of disparity and the relative contribution of each region to the overall inequality, allowing for a detailed analysis of how different regions contribute to the national GIE disparity34.
T represents the total disparity nationwide, where Tb and Tw denote the between-region disparity and within-region disparity within the eight economic regions, respectively, while Tk signifies the disparity within the k-th economic region34.
$$T = \frac{1}{n}\sum\nolimits_{i=1}^{n} \frac{GIE_{i}}{\overline{GIE}} \times \log\frac{GIE_{i}}{\overline{GIE}} = \sum\nolimits_{k=1}^{K} \frac{n_{k}\overline{GIE_{k}}}{n\overline{GIE}} \times \log\frac{\overline{GIE_{k}}}{\overline{GIE}} + \sum\nolimits_{k=1}^{K} \frac{n_{k}\overline{GIE_{k}}}{n\overline{GIE}} \times T_{k}$$
(6)
$$T_{b} = \sum\nolimits_{k=1}^{K} \frac{n_{k}\overline{GIE_{k}}}{n\overline{GIE}} \times \log\frac{\overline{GIE_{k}}}{\overline{GIE}}$$
(7)
$$T_{w} = \sum\nolimits_{k=1}^{K} \frac{n_{k}\overline{GIE_{k}}}{n\overline{GIE}} \times T_{k}$$
(8)
$$T_{k} = \frac{1}{n_{k}}\sum\nolimits_{i=1}^{n_{k}} \frac{GIE_{ki}}{\overline{GIE_{k}}} \times \log\frac{GIE_{ki}}{\overline{GIE_{k}}}$$
(9)
Where \(n\) represents the total number of provinces, \(K\) represents the number of delineated economic regions, and \(n_{k}\) denotes the number of provinces within the k-th (\(k = 1,2,\cdots,K\)) economic region; \(GIE_{i}\) represents the GIE of the i-th (\(i = 1,2,\cdots,n\)) province, \(\overline{GIE}\) represents the average GIE of all provinces, \(GIE_{ki}\) represents the GIE of the i-th province within the k-th economic region, and \(\overline{GIE_{k}}\) represents the average GIE of all provinces within the k-th economic region.
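The decomposition in Eqs. (6) to (9) can be checked numerically with a short sketch. The function names, the region labels, and the GIE values below are hypothetical; the code assumes strictly positive GIE values, since the logarithm is otherwise undefined.

```python
import numpy as np

def theil(values):
    """Theil index of a positive vector, as in Eq. (9) (and Eq. (6) for the full sample)."""
    v = np.asarray(values, dtype=float)
    ratio = v / v.mean()
    return float(np.mean(ratio * np.log(ratio)))

def theil_decomposition(gie, region):
    """Return (T, T_b, T_w) following Eqs. (6)-(8)."""
    gie = np.asarray(gie, dtype=float)
    region = np.asarray(region)
    n, mean_all = gie.size, gie.mean()
    t_between, t_within = 0.0, 0.0
    for k in np.unique(region):
        g = gie[region == k]
        # Share term n_k * mean_k / (n * mean) used in Eqs. (7) and (8).
        share = g.size * g.mean() / (n * mean_all)
        t_between += share * np.log(g.mean() / mean_all)
        t_within += share * theil(g)
    return theil(gie), float(t_between), float(t_within)

# Toy data (hypothetical): six provinces in three regions.
gie = [0.4, 0.6, 0.8, 1.0, 0.5, 0.7]
region = ["A", "A", "B", "B", "C", "C"]
total, between, within = theil_decomposition(gie, region)
print(total, between, within)
```

In this sketch the total index equals the sum of the between-region and within-region components up to floating-point error, which is the identity stated in Eq. (6).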
Moran index
We use the Moran index to examine the spatial clustering characteristics of GIE in China’s high-tech industry. The Moran index is a widely used spatial statistic that measures spatial dependence, allowing us to assess the extent to which similar values cluster geographically14. This clustering reveals patterns of how GIE is distributed across regions. China’s high-tech industry shows significant spatial heterogeneity, with regional variations in technological capabilities, resource allocation, policy environments, and industrial development leading to uneven green innovation outcomes. By applying the Moran index, we can identify whether regions with high or low GIE values tend to cluster or if there is a random distribution. This information is crucial for policymakers to pinpoint areas where regional green innovation efforts complement or hinder one another.
The global Moran index is employed to test the overall spatial clustering effect.
$$I = \frac{\sum\nolimits_{i=1}^{n} \sum\nolimits_{j=1}^{n} w_{ij}(GIE_{i} - \overline{GIE})(GIE_{j} - \overline{GIE})}{s^{2}\sum\nolimits_{i=1}^{n} \sum\nolimits_{j=1}^{n} w_{ij}}$$
(10)
Where \(s^{2}\) represents the variance of GIE across all provinces, and \(w_{ij}\) denotes the inverse distance spatial weights calculated using latitude and longitude data. We use the local Moran index to test for local spatial clustering effects; the local Moran index for the i-th province is given in Eq. (11). The local Moran indices are then combined to construct LISA cluster maps.
$$I_{i} = \frac{GIE_{i} - \overline{GIE}}{s^{2}}\sum\nolimits_{j=1}^{n} w_{ij}(GIE_{j} - \overline{GIE})$$
(11)
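A minimal sketch of Eqs. (10) and (11) follows. Two choices are assumptions rather than details confirmed by the text: distances are computed as plain Euclidean distances on the coordinate pairs (a great-circle distance may be more appropriate for longitude and latitude), and \(s^{2}\) is taken as the population variance (division by \(n\)). All function names and data are hypothetical.

```python
import numpy as np

def inverse_distance_weights(coords):
    """Inverse-distance weights w_ij = 1 / d_ij with a zero diagonal.

    Assumption: Euclidean distance on the raw coordinate pairs.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    w = np.zeros_like(dist)
    off_diag = ~np.eye(len(coords), dtype=bool)
    w[off_diag] = 1.0 / dist[off_diag]
    return w

def global_moran(gie, w):
    """Global Moran index, Eq. (10)."""
    z = np.asarray(gie, dtype=float) - np.mean(gie)
    s2 = np.mean(z ** 2)  # assumption: population variance
    return float((w * np.outer(z, z)).sum() / (s2 * w.sum()))

def local_moran(gie, w):
    """Local Moran index for every province, Eq. (11)."""
    z = np.asarray(gie, dtype=float) - np.mean(gie)
    s2 = np.mean(z ** 2)
    return z / s2 * (w @ z)

# Toy data (hypothetical): two spatial groups with similar GIE inside each group.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
gie = np.array([0.9, 0.8, 0.85, 0.3, 0.35])
w = inverse_distance_weights(coords)
moran_i = global_moran(gie, w)
local_i = local_moran(gie, w)
print(moran_i, local_i)
```

With these definitions the local indices sum to the global index multiplied by the total weight, which gives a quick consistency check between the two statistics.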
Standard deviation ellipse
We use the Standard Deviation Ellipse (SDE) to illustrate the clustering center and range of GIE in China’s high-tech industry. The SDE provides a geometric representation of the spatial dispersion of GIE across regions, showing how far the GIE-weighted locations spread around their center along two perpendicular directions. The ellipse’s center is the GIE-weighted geographical centroid, serving as a reference for the central tendency. The ellipse’s orientation indicates the main direction along which GIE is distributed. The axes represent the standard deviations in two dimensions, with the longer axis reflecting a greater spread of GIE and the shorter axis suggesting a more concentrated distribution14. This approach aids in understanding the geographical range of GIE performance within the high-tech industry.
Representing the geographical coordinates of the i-th province as \((x_{i},y_{i})\), the geographical centroid of GIE in China’s high-tech industry is denoted by \((x_{GIE},y_{GIE})\).
$$x_{GIE} = \frac{\sum\nolimits_{i=1}^{n} GIE_{i} \times x_{i}}{\sum\nolimits_{i=1}^{n} GIE_{i}}, \quad y_{GIE} = \frac{\sum\nolimits_{i=1}^{n} GIE_{i} \times y_{i}}{\sum\nolimits_{i=1}^{n} GIE_{i}}$$
(12)
Calculate the length of the major semi-axis (standard deviation along the x-axis) and the minor semi-axis (standard deviation along the y-axis) of the SDE. Use \((\tilde{x}_{i},\tilde{y}_{i})\), with \(\tilde{x}_{i} = x_{i} - x_{GIE}\) and \(\tilde{y}_{i} = y_{i} - y_{GIE}\), to represent the coordinates of the i-th province relative to the centroid. Let
$$A = \sum\nolimits_{i=1}^{n} (GIE_{i} \times \tilde{x}_{i})^{2} - \sum\nolimits_{i=1}^{n} (GIE_{i} \times \tilde{y}_{i})^{2}$$
$$B = \sum\nolimits_{i=1}^{n} GIE_{i} \times \tilde{x}_{i} \times \tilde{y}_{i}$$
$$\theta = \arctan\frac{A + \sqrt{A^{2} + 4B^{2}}}{2B}$$
Finally, the standard deviations along the x-axis and the y-axis are obtained.
$$\sigma_{x} = \sqrt{\frac{\sum\nolimits_{i=1}^{n} (GIE_{i} \times \tilde{x}_{i} \times \cos\theta - GIE_{i} \times \tilde{y}_{i} \times \sin\theta)^{2}}{\sum\nolimits_{i=1}^{n} GIE_{i}^{2}}}, \quad \sigma_{y} = \sqrt{\frac{\sum\nolimits_{i=1}^{n} (GIE_{i} \times \tilde{x}_{i} \times \sin\theta - GIE_{i} \times \tilde{y}_{i} \times \cos\theta)^{2}}{\sum\nolimits_{i=1}^{n} GIE_{i}^{2}}}$$
(13)
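The SDE quantities can be computed directly from Eqs. (12) and (13) and the definitions of \(A\), \(B\), and \(\theta\). The sketch below follows the formulas as printed; the only deviation is that it uses `np.arctan2` instead of a plain arctangent so that the case \(B = 0\) does not divide by zero, which yields the same axis orientation up to a half turn. The function name and the coordinates are hypothetical.

```python
import numpy as np

def standard_deviation_ellipse(gie, x, y):
    """Return ((x_GIE, y_GIE), theta, sigma_x, sigma_y) per Eqs. (12)-(13)."""
    g = np.asarray(gie, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # GIE-weighted centroid, Eq. (12).
    cx = (g * x).sum() / g.sum()
    cy = (g * y).sum() / g.sum()
    xt, yt = x - cx, y - cy  # coordinates relative to the centroid
    a = ((g * xt) ** 2).sum() - ((g * yt) ** 2).sum()
    b = (g * xt * yt).sum()
    # Rotation angle; arctan2 keeps the expression defined when b == 0.
    theta = np.arctan2(a + np.sqrt(a ** 2 + 4.0 * b ** 2), 2.0 * b)
    denom = (g ** 2).sum()
    sigma_x = np.sqrt(((g * xt * np.cos(theta) - g * yt * np.sin(theta)) ** 2).sum() / denom)
    sigma_y = np.sqrt(((g * xt * np.sin(theta) - g * yt * np.cos(theta)) ** 2).sum() / denom)
    return (float(cx), float(cy)), float(theta), float(sigma_x), float(sigma_y)

# Toy data (hypothetical): five provinces with longitude-like x and latitude-like y.
centre, theta, sigma_x, sigma_y = standard_deviation_ellipse(
    gie=[0.9, 0.7, 0.5, 0.4, 0.6],
    x=[121.5, 116.4, 104.1, 87.6, 113.3],
    y=[31.2, 39.9, 30.7, 43.8, 23.1],
)
print(centre, theta, sigma_x, sigma_y)
```

Tracking how the centroid and the two standard deviations change between years is one way to read shifts in the clustering center and range described above.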
Spatial Markov chain
We use the spatial Markov chain (SMC) to illustrate the dynamic transition trends and convergence characteristics of GIE in China’s high-tech industry. Spatial Markov chains allow for the examination of spatial dependencies by considering how a region’s GIE level is influenced by its own characteristics and those of neighboring regions. This is crucial for understanding the regional diffusion of green technologies and innovation practices within the high-tech industry. A Markov chain analyzes dynamic transitions by modeling the probabilistic movement of regions between different GIE states over time. The SMC model accounts for shifts in GIE states due to policy interventions, technological advancements, and regional economic changes. Using SMC, we can track the likelihood of these transitions and identify patterns of progress or stagnation in regional GIE performance, providing a comprehensive view of its dynamic behavior19.
We classify GIE in the high-tech industry into \(L\) types. Let \(n_{i}^{t}\) denote the number of provinces belonging to type i in year t, and \(n_{ij}^{t,t+T}\) the number of provinces belonging to type i in year t that transition to type j after T years. The nationwide state transition probability of high-tech industry GIE shifting from type i in year t to type j in year t + T is as follows.
$$P_{ij}^{t,t+T} = \frac{\sum\nolimits_{t=2006}^{2022-T} n_{ij}^{t,t+T}}{\sum\nolimits_{t=2006}^{2022-T} n_{i}^{t}}, \quad i,j = 1,2,\cdots,L$$
(14)
Based on Eq. (14), incorporating the spatial lag of GIE, we ultimately derive the \(L \times L \times L\) order SMC.
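The counting behind Eq. (14) and its spatial extension can be sketched as follows. The sketch assumes that each province-year has already been assigned a GIE type coded 0 to L-1, and that a matching array of neighbor (spatial lag) types is available; how the types are delimited is not specified here, so both arrays are hypothetical inputs, as are the function names.

```python
import numpy as np

def transition_matrix(states, n_types, lag=1):
    """Pooled transition probabilities, Eq. (14).

    states : integer array (n_provinces, n_years) with values 0..n_types-1
    lag    : the horizon T in years
    """
    states = np.asarray(states)
    counts = np.zeros((n_types, n_types))
    for t in range(states.shape[1] - lag):
        # Count provinces moving from type i in year t to type j in year t + lag.
        np.add.at(counts, (states[:, t], states[:, t + lag]), 1)
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

def spatial_transition_matrices(states, lag_states, n_types, lag=1):
    """One transition matrix per neighbor type: shape (L, L, L)."""
    states = np.asarray(states)
    lag_states = np.asarray(lag_states)
    counts = np.zeros((n_types, n_types, n_types))
    for t in range(states.shape[1] - lag):
        # Condition each transition on the neighbor type observed in year t.
        np.add.at(counts, (lag_states[:, t], states[:, t], states[:, t + lag]), 1)
    totals = counts.sum(axis=2, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

# Toy data (hypothetical): three provinces, four years, two types.
states = np.array([[0, 0, 1, 1],
                   [1, 1, 1, 0],
                   [0, 1, 1, 1]])
p = transition_matrix(states, n_types=2)
print(p)
```

Rows with no observations are returned as zeros rather than dividing by zero. Comparing the conditional matrices across neighbor types indicates whether a province's chance of moving up or down depends on the level of its neighbors.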
β-Convergence model
The β-convergence model is used to analyze the dynamic convergence trends of GIE in China’s high-tech industry. It helps determine whether GIE in different provinces is converging. If regions with lower initial GIE values experience faster growth, the β-convergence model suggests a reduction in spatial disparities, indicating that provincial GIE levels are drawing closer over time35. Identifying β-convergence in China’s high-tech industry provides insights for policymakers, helping assess the effectiveness of strategies like green innovation promotion and sustainable development policies in narrowing regional gaps in GIE and supporting sustainable growth. The model allows for regional comparisons, serving as a benchmark to evaluate the impact of local environmental and innovation policies on GIE enhancement. Analyzing convergence trends can also reveal whether lagging areas are catching up with leading regions, potentially signaling successful innovation diffusion and policy interventions2.
We formulate the panel data model as follows:
$$\log\frac{GIE_{i,t}}{GIE_{i,t-1}} = \beta\log GIE_{i,t-1} + u_{i} + v_{t} + \varepsilon_{i,t}$$
(15)
In Eq. (15), \(\beta\) represents the convergence coefficient, where \(\beta < 0\) indicates that the GIE of the high-tech industry is converging, with the corresponding convergence rate denoted by \(\phi = -\ln(1 + \beta)\). Conversely, \(\beta > 0\) signifies a divergent state of GIE in the high-tech industry, while \(\beta = 0\) denotes a state of relative equilibrium in GIE. The study employs a quantile regression method with temporal and spatial fixed effects for estimation to validate dynamic convergence trends across different efficiency levels.
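To make the roles of \(\beta\) and \(\phi\) in Eq. (15) concrete, the sketch below estimates \(\beta\) on a balanced panel with a two-way within (fixed-effects) least-squares estimator. This is a simplified stand-in and not the quantile regression used in the study; it only recovers a mean effect. The function name and the simulated panel are hypothetical.

```python
import numpy as np

def beta_convergence(gie):
    """Two-way fixed-effects OLS estimate of beta in Eq. (15).

    gie : positive array (n_provinces, n_years), balanced panel
    Returns (beta, phi) with phi = -ln(1 + beta).
    """
    log_gie = np.log(np.asarray(gie, dtype=float))
    growth = log_gie[:, 1:] - log_gie[:, :-1]  # log(GIE_t / GIE_{t-1})
    lagged = log_gie[:, :-1]                   # log GIE_{t-1}

    def demean(a):
        # Remove province effects u_i and year effects v_t (balanced panel).
        return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

    y, x = demean(growth), demean(lagged)
    beta = float((x * y).sum() / (x * x).sum())
    phi = float(-np.log1p(beta)) if beta > -1.0 else float("nan")
    return beta, phi

# Simulated panel (hypothetical): noise-free data generated with beta = -0.3.
u = np.array([0.10, -0.20, 0.05, 0.30])        # province effects
v = np.array([0.00, 0.02, -0.01, 0.03, 0.01])  # year effects
logs = np.zeros((4, 6))
logs[:, 0] = [-1.0, -0.5, -2.0, -0.1]
for t in range(1, 6):
    logs[:, t] = 0.7 * logs[:, t - 1] + u + v[t - 1]
beta_hat, phi_hat = beta_convergence(np.exp(logs))
print(beta_hat, phi_hat)
```

Because the simulated panel contains no noise, the estimator recovers the coefficient used to generate it; with real data, the quantile approach described above would additionally show how \(\beta\) varies across efficiency levels.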