Most Read

  • Topic--Intelligent Agricultural Knowledge Services and Smart Unmanned Farms(Part 1)
    ZHAO Chunjiang, LI Jingchen, WU Huarui, YANG Yusen
    Smart Agriculture. 2024, 6(6): 63-71. https://doi.org/10.12133/j.smartag.SA202410008

    [Objective] In the era of digital agriculture, real-time monitoring and predictive modeling of crop growth are paramount, especially in autonomous farming systems. Traditional crop growth models, often constrained by their reliance on static, rule-based methods, fail to capture the dynamic and multifactorial nature of vegetable crop growth. Modeling the growth of vegetable crops within digital twin platforms has historically been hindered by the complex interactions among biotic and abiotic factors. This research addressed these challenges by leveraging the advanced reasoning capabilities of pre-trained large language models (LLMs) to simulate and predict vegetable crop growth with accuracy and reliability. [Methods] The methodology was structured in several distinct phases. Initially, a comprehensive dataset was curated to include extensive information on vegetable crop growth cycles, environmental conditions, and management practices. This dataset incorporated continuous data streams such as soil moisture, nutrient levels, climate variables, pest occurrence, and historical growth records. By combining these data sources, the study ensured that the model was well-equipped to understand and infer the complex interdependencies inherent in crop growth processes. Then, advanced techniques were employed for pre-training and fine-tuning LLMs to adapt them to the domain-specific requirements of vegetable crop modeling. A staged intelligent agent ensemble was designed to work within the digital twin platform, consisting of a central managerial agent and multiple stage-specific agents. The managerial agent was responsible for identifying transitions between distinct growth stages of the crops, while the stage-specific agents were tailored to handle the unique characteristics of each growth phase. This modular architecture enhanced the model's adaptability and precision, ensuring that each phase of growth received specialized attention and analysis. [Results and Discussions] The experimental validation of this method was conducted in a controlled agricultural setting at the Xiaotangshan Modern Agricultural Demonstration Park in Beijing. Cabbage (Zhonggan 21) was selected as the test crop due to its significance in agricultural production and the availability of comprehensive historical growth data. Over five years, the dataset collected included 4 300 detailed records, documenting parameters such as plant height, leaf count, soil conditions, irrigation schedules, fertilization practices, and pest management interventions. This dataset was used to train the LLM-based system and evaluate its performance using ten-fold cross-validation. The results of the experiments demonstrated the efficacy of the proposed system in addressing the complexities of vegetable crop growth modeling. The LLM-based model achieved 98% accuracy in predicting crop growth degrees and 99.7% accuracy in identifying growth stages. These metrics significantly outperform traditional machine learning approaches, including long short-term memory (LSTM), XGBoost, and LightGBM models. The superior performance of the LLM-based system highlights its ability to reason over heterogeneous data inputs and make precise predictions, setting a new benchmark for crop modeling technologies. Beyond accuracy, the LLM-powered system also excels in its ability to simulate growth trajectories over extended periods, enabling farmers and agricultural managers to anticipate potential challenges and make proactive decisions.
For example, by integrating real-time sensor data with historical patterns, the system can predict how changes in irrigation or fertilization practices will impact crop health and yield. This predictive capability is invaluable for optimizing resource allocation and mitigating risks associated with climate variability and pest outbreaks. [Conclusions] The study emphasizes the importance of high-quality data in achieving reliable and generalizable models. The comprehensive dataset used in this research not only captures the nuances of cabbage growth but also provides a blueprint for extending the model to other crops. In conclusion, this research demonstrates the transformative potential of combining large language models with digital twin technology for vegetable crop growth modeling. By addressing the limitations of traditional modeling approaches and harnessing the advanced reasoning capabilities of LLMs, the proposed system sets a new standard for precision agriculture. Several avenues also are proposed for future work, including expanding the dataset, refining the model architecture, and developing multi-crop and multi-region capabilities.
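The staged agent ensemble described above can be illustrated with a minimal sketch in Python. It assumes a generic chat-completion wrapper passed in as `call_llm`; the stage names, prompts, and routing rule below are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch of a staged agent ensemble for growth-stage reasoning.
# `call_llm` is any user-supplied chat-completion wrapper; prompts and
# stage names are hypothetical, not taken from the paper.
from typing import Callable, Dict

STAGE_PROMPTS: Dict[str, str] = {
    "seedling": "You model cabbage seedling growth. Given sensor readings, predict daily height and leaf-count change.",
    "rosette":  "You model the rosette stage. Given sensor readings, predict the daily growth degree.",
    "heading":  "You model the heading stage. Given sensor readings, predict head firmness and growth degree.",
}

def managerial_agent(call_llm: Callable[[str], str], observation: str) -> str:
    """Ask the managerial agent which growth stage the crop is in."""
    prompt = (
        "Classify the current cabbage growth stage as one of "
        f"{list(STAGE_PROMPTS)} based on this observation:\n{observation}\n"
        "Answer with the stage name only."
    )
    return call_llm(prompt).strip().lower()

def stage_agent(call_llm: Callable[[str], str], stage: str, observation: str) -> str:
    """Delegate the prediction to the matching stage-specific agent."""
    system = STAGE_PROMPTS.get(stage, STAGE_PROMPTS["seedling"])
    return call_llm(f"{system}\nObservation:\n{observation}")

def predict_growth(call_llm: Callable[[str], str], observation: str) -> str:
    stage = managerial_agent(call_llm, observation)
    return stage_agent(call_llm, stage, observation)
```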

  • Information Processing and Decision Making
    LI Zusheng, TANG Jishen, KUANG Yingchun
    Smart Agriculture. 2025, 7(2): 146-159. https://doi.org/10.12133/j.smartag.SA202412003

    [Objective] The accuracy of identifying litchi pests is crucial for implementing effective control strategies and promoting sustainable agricultural development. However, the current detection of litchi pests is characterized by a high percentage of small targets, which makes it challenging for target detection models to balance accuracy and parameter count, thus limiting their application in real-world production environments. To improve the identification efficiency of litchi pests, a lightweight target detection model YOLO-LP (YOLO-Litchi Pests) based on YOLOv10n was proposed. The model aimed to enhance the detection accuracy of small litchi pest targets in multiple scenarios by optimizing the network structure and loss function, while also reducing the number of parameters and computational costs. [Methods] Images of two classes of litchi insect pests (Cocoon and Gall) were collected as datasets for modeling in natural scenarios (sunny, cloudy, post-rain) and laboratory environments. The original data were expanded through random scaling, random panning, random brightness adjustments, random contrast variations, and Gaussian blurring to balance the category samples and enhance the robustness of the model, generating a richer dataset named the CG dataset (Cocoon and Gall dataset). The YOLO-LP model was constructed after the following three improvements. Specifically, the C2f module of the backbone network (Backbone) in YOLOv10n was optimized and the C2f_GLSA module was constructed using the global-to-local spatial aggregation (GLSA) module to focus on small targets and enhance the differentiation between the targets and the backgrounds, while simultaneously reducing the number of parameters and computation. A frequency-aware feature fusion module (FreqFusion) was introduced into the neck network (Neck) of YOLOv10n and a frequency-aware path aggregation network (FreqPANet) was designed to reduce the complexity of the model and address the problem of fuzzy and shifted target boundaries. The SCYLLA-IoU (SIoU) loss function replaced the Complete-IoU (CIoU) loss function of the baseline model to optimize target localization accuracy and accelerate the convergence of the training process. [Results and Discussions] YOLO-LP achieved 90.9%, 62.2%, and 59.5% for AP50, AP50:95, and AP-Small50:95 on the CG dataset, respectively, which were 1.9%, 1.0%, and 1.2% higher than those of the baseline model. The number of parameters and the computational costs were reduced by 13% and 17%, respectively. These results suggested that YOLO-LP had high accuracy and a lightweight design. Comparison experiments with different attention mechanisms validated the effectiveness of the GLSA module. After the GLSA module was added to the baseline model, AP50, AP50:95, and AP-Small50:95 achieved the highest performance on the CG dataset, reaching 90.4%, 62.0%, and 59.5%, respectively. Experimental results comparing different loss functions showed that the SIoU loss function provided better fitting and faster convergence on the CG dataset. Ablation test results confirmed the validity of each improvement and showed that any combination of the three improvements performed significantly better than the baseline model. The performance of the model was optimal when all three improvements were applied simultaneously.
Compared to several mainstream models, YOLO-LP exhibited the best overall performance, with a model size of only 5.1 MB, 1.97 million parameters (Params), and a computational volume of 5.4 GFLOPs. Compared to the baseline model, the detection performance of YOLO-LP was significantly improved across the four scenarios. In the sunny day scenario, AP50, AP50:95, and AP-Small50:95 increased by 1.9%, 1.0%, and 2.0%, respectively. In the cloudy day scenario, AP50, AP50:95, and AP-Small50:95 increased by 2.5%, 1.3%, and 1.3%, respectively. In the post-rain scenario, AP50, AP50:95, and AP-Small50:95 increased by 2.0%, 2.4%, and 2.4%, respectively. In the laboratory scenario, only AP50 increased, by 0.7% over the baseline model. These findings indicated that YOLO-LP achieved higher accuracy and robustness in multi-scenario small target detection of litchi pests. [Conclusions] The proposed YOLO-LP model could improve detection accuracy and effectively reduce the number of parameters and computational costs. It performed well in small target detection of litchi pests and demonstrated strong robustness across different scenarios. These improvements made the model more suitable for deployment on resource-constrained mobile and edge devices. The model provided a valuable technical reference for small target detection of litchi pests in various scenarios.
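The augmentation pipeline named for the CG dataset (random scaling, panning, brightness, contrast, and Gaussian blurring) can be sketched with OpenCV and NumPy as below. The parameter ranges are assumed for illustration only, and bounding boxes would need to be transformed alongside the image.

```python
# Minimal sketch of the named augmentations; ranges are illustrative assumptions.
import cv2
import numpy as np

rng = np.random.default_rng(0)

def augment(img: np.ndarray) -> np.ndarray:
    h, w = img.shape[:2]
    # Random scaling, then crop/pad back to the original size.
    s = rng.uniform(0.8, 1.2)
    scaled = cv2.resize(img, (int(w * s), int(h * s)))
    canvas = np.zeros_like(img)
    ch, cw = min(h, scaled.shape[0]), min(w, scaled.shape[1])
    canvas[:ch, :cw] = scaled[:ch, :cw]
    img = canvas
    # Random panning (translation) with border replication.
    tx, ty = rng.integers(-w // 10, w // 10), rng.integers(-h // 10, h // 10)
    M = np.float32([[1, 0, tx], [0, 1, ty]])
    img = cv2.warpAffine(img, M, (w, h), borderMode=cv2.BORDER_REPLICATE)
    # Random brightness/contrast: out = alpha * img + beta.
    alpha, beta = rng.uniform(0.8, 1.2), rng.uniform(-20, 20)
    img = cv2.convertScaleAbs(img, alpha=alpha, beta=beta)
    # Occasional Gaussian blur with an odd kernel size.
    if rng.random() < 0.5:
        img = cv2.GaussianBlur(img, (5, 5), 0)
    return img
```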

  • Topic--Intelligent Sensing and Grading of Agricultural Product Quality
    YANG Qilang, YU Lu, LIANG Jiaping
    Smart Agriculture. 2025, 7(4): 84-94. https://doi.org/10.12133/j.smartag.SA202501024

    [Objective] Asparagus officinalis L. is a perennial plant with a long harvesting cycle and fast growth rate. The harvesting period of the tender stems is relatively concentrated, and their shelf life is very short. Therefore, harvested asparagus needs to be classified according to specifications within a short time and then packaged and sold. However, at this stage, the classification of asparagus specifications basically depends on manual work, and grading asparagus of different specifications by sensory judgment is difficult and requires considerable money and labor. To save labor costs, an algorithm for classification based on asparagus stem diameter was developed using deep learning and computer vision technology. YOLOv11 was selected as the baseline model and several improvements were made to propose a lightweight model for accurate grading of post-harvest asparagus. [Methods] The dataset was obtained by photographing post-harvest asparagus with a cell phone at fixed camera positions. In order to improve the generalization ability of the model, the training set was augmented by increasing contrast, mirroring, and adjusting brightness. The augmented training set included a total of 2 160 images for training the model, and the test set and validation set included 90 and 540 images, respectively, for inference and validation of the model. In order to enhance the performance of the improved model, the following four improvements were made to the baseline model. First, the efficient channel attention (ECA) module was added to the twelfth layer of the YOLOv11 backbone network. The ECA module enhanced asparagus stem diameter feature extraction by dynamically adjusting channel weights in the convolutional neural network and improved the recognition accuracy of the improved model. Second, the bi-directional feature pyramid network (BiFPN) module was integrated into the neck network. This module modified the original feature fusion method to automatically emphasize key asparagus features and improved the grading accuracy through multi-scale feature fusion. Moreover, BiFPN dynamically adjusted the importance of each layer to reduce redundant computations. Next, the slim-neck module was applied to optimize the neck network. The slim-neck module consisted of GSConv and VoVGSCSP. The GSConv module replaced the traditional convolution, and the VoVGSCSP module replaced the C3k2 module. This optimization reduced computational costs and model size while improving the recognition accuracy. Finally, the original YOLOv11 detection head was replaced with an EfficientDet Head, which had the advantages of light weight and high accuracy. This head was co-trained with BiFPN to enhance the effect of multi-scale fusion and improve the performance of the model. [Results and Discussions] In order to verify the validity of the individual modules introduced in the improved YOLOv11 model and the superiority of the improved model's performance, ablation experiments and comparison experiments were conducted, respectively. The results of the comparison test between different attention mechanisms added to the baseline model showed that the ECA module performed better than other attention mechanisms in the post-harvest asparagus grading task. YOLOv11-ECA achieved higher recognition accuracy with a smaller model size, confirming the reliability of selecting the ECA module.
Ablation experiments demonstrated that the improved YOLOv11 achieved 96.8% precision (P), 96.9% recall (R), and 92.5% mean average precision (mAP), with 4.6 GFLOPs, 1.67 × 10⁶ parameters, and a 3.6 MB model size. The results of the asparagus grading test indicated that the localization boxes of the improved model were more accurate and had a higher confidence level. Compared with the original YOLOv11 model, the improved YOLOv11 model increased the precision, recall, and mAP by 2.6, 1.4, and 2.2 percentage points, respectively, while the floating-point operations, parameter count, and model size were reduced by 1.7 GFLOPs, 9.1 × 10⁵, and 1.6 MB, respectively. Moreover, the various improvements to the model increased its accuracy while keeping the model lightweight. In addition, the results of the comparative tests showed that the performance of the improved YOLOv11 model was better than those of SSD, YOLOv5s, YOLOv8n, YOLOv11, and YOLOv12. Overall, the improved YOLOv11 had the best overall performance but still had some shortcomings. In terms of real-time performance, the inference speed of the improved model was not optimal and was inferior to that of YOLOv5s and YOLOv8n. The inference speeds of the improved YOLOv11 and the original YOLOv11 were further compared statistically, and the results of the Wilcoxon signed-rank test showed that the improved YOLOv11 achieved a significant improvement in inference speed compared to the original YOLOv11 model. [Conclusions] The improved YOLOv11 model demonstrated better recognition, fewer parameters and floating-point operations, and a smaller model size in the asparagus grading task. The improved YOLOv11 could provide a theoretical foundation for intelligent post-harvest asparagus grading. Deploying the improved YOLOv11 model on asparagus grading equipment enables fast and accurate grading of post-harvest asparagus.
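The paired Wilcoxon signed-rank comparison of inference speeds mentioned above can be reproduced in outline with SciPy; the timing arrays here are hypothetical placeholders rather than the paper's measurements.

```python
# Minimal sketch of a paired Wilcoxon signed-rank test on per-image inference times.
import numpy as np
from scipy.stats import wilcoxon

baseline_ms = np.array([6.1, 5.9, 6.3, 6.0, 6.2, 5.8, 6.4, 6.1])   # hypothetical timings
improved_ms = np.array([5.4, 5.2, 5.6, 5.3, 5.5, 5.1, 5.7, 5.4])   # hypothetical timings

stat, p_value = wilcoxon(baseline_ms, improved_ms)
print(f"Wilcoxon statistic = {stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Inference-speed difference is statistically significant.")
```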

  • Topic--Intelligent Agricultural Knowledge Services and Smart Unmanned Farms(Part 1)
    GAO Qun, WANG Hongyang, CHEN Shiyao
    Smart Agriculture. 2024, 6(6): 168-179. https://doi.org/10.12133/j.smartag.SA202404005

    [Objective] In order to summarize exemplary cases of high-quality development in regional smart agriculture and contribute strategies for the sustainable advancement of the national smart agriculture cause, the spatio-temporal characteristics and key driving factors of smart farms in the Yangtze River Economic Belt were studied. [Methods] Based on data from 11 provinces (municipalities) spanning the years 2014 to 2023, a comprehensive analysis was conducted on the spatio-temporal differentiation characteristics of smart farms in the Yangtze River Economic Belt using methods such as kernel density analysis, spatial auto-correlation analysis, and the standard deviation ellipse, covering the overall spatial clustering characteristics, high-value and low-value clustering phenomena, centroid characteristics, and dynamic change trends. Subsequently, the geographic detector was employed to identify the key factors driving the spatio-temporal differentiation of smart farms and to discern the interactions between different factors. The analysis was conducted across seven dimensions: special fiscal support, industry dependence, human capital, urbanization, agricultural mechanization, internet infrastructure, and technological innovation. [Results and Discussions] Firstly, in terms of temporal characteristics, the number of smart farms in the Yangtze River Economic Belt steadily increased over the past decade. The year 2016 marked a significant turning point, after which the growth rate of smart farms accelerated noticeably. The development of the upper, middle, and lower reaches exhibited both commonalities and disparities. Specifically, the lower sub-regions had a higher overall development level of smart farms, with a fluctuating upward growth rate; the middle sub-regions were at a moderate level, showing a fluctuating upward growth rate and relatively even provincial distribution; the upper sub-regions had a low development level, with a stable and slow growth rate and an unbalanced provincial distribution. Secondly, in terms of spatial distribution, smart farms in the Yangtze River Economic Belt exhibited a dispersed agglomeration pattern. The results of global auto-correlation indicated that smart farms in the Yangtze River Economic Belt tended to be randomly distributed. The results of local auto-correlation showed that the predominant patterns of agglomeration were H-L and L-H types, with the distribution across provinces being somewhat complex; H-H type agglomeration areas were mainly concentrated in Sichuan, Hubei, and Anhui; L-L type agglomeration areas were primarily in Yunnan and Guizhou. The standard deviation ellipse results revealed that the mean center of smart farms in the Yangtze River Economic Belt had shifted from Anqing city in Anhui province in 2014 to Jingzhou city in Hubei province in 2023, with the spatial distribution showing an overall trend of shifting southwestward and a slow expansion toward the northeast and south. Finally, in terms of key driving factors, technological innovation was the primary critical factor driving the formation of the spatio-temporal distribution pattern of smart farms in the Yangtze River Economic Belt, with a factor explanatory degree of 0.311 1. Moreover, after interacting with other indicators, it continued to play a crucial role in the spatio-temporal distribution of smart farms, which aligned with the practical logic of smart farm development.
Urbanization and agricultural mechanization levels were the second and third largest key factors, with factor explanatory degrees of 0.292 2 and 0.251 4, respectively. The key driving factors for the spatio-temporal differentiation of smart farms in the upper, middle, and lower sub-regions exhibited both commonalities and differences. Specifically, the top two key driving factors identified in the upper region were technological innovation (0.841 9) and special fiscal support (0.782 3). In the middle region, they were technological innovation (0.619 0) and human capital (0.600 1), while in the lower region, they were urbanization (0.727 6) and technological innovation (0.425 4). The identification of key driving factors and the detection of their interactive effects further confirmed that the spatio-temporal distribution characteristics of smart farms in the Yangtze River Economic Belt were the result of the comprehensive action of multiple factors. [Conclusions] The development of smart farms in the Yangtze River Economic Belt is showing positive momentum, with both the total number of smart farms and the numbers in each sub-region experiencing stable growth. The development speed and level of smart farms in the sub-regions exhibit a differentiated characteristic of "lower reaches > middle reaches > upper reaches". At the same time, the overall distribution of smart farms in the Yangtze River Economic Belt is relatively balanced, with the degree of sub-regional distribution balance being "middle reaches (Hubei province, Hunan province, and Jiangxi province are balanced) > lower reaches (dominated by Anhui) > upper reaches (Sichuan stands out)". The coverage of smart farm site selection continues to expand, forming a "northeast-southwest" horizontal diffusion pattern. In addition, the spatio-temporal characteristics of smart farms in the Yangtze River Economic Belt are the result of the comprehensive action of multiple factors, with the explanatory power of factors ranked from high to low as follows: technological innovation > urbanization > agricultural mechanization > human capital > internet infrastructure > industry dependence > special fiscal support. Moreover, the influence of each factor is further strengthened after interaction. Based on these conclusions, suggestions are proposed to promote the high-quality development of smart farms in the Yangtze River Economic Belt. This study not only provides a theoretical basis and reference for the construction of smart farms in the Yangtze River Economic Belt and other regions, but also helps to grasp the current status and future trends of smart farm development.
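As a minimal illustration of the spatial auto-correlation analysis referred to above, the global Moran's I statistic can be computed as follows; the weight matrix and counts are hypothetical, and the paper's actual weighting scheme is not reproduced here.

```python
# Minimal sketch of global Moran's I for spatial auto-correlation of smart-farm counts.
import numpy as np

def morans_i(x: np.ndarray, w: np.ndarray) -> float:
    """Global Moran's I for values x and spatial weight matrix w."""
    n = x.size
    z = x - x.mean()
    s0 = w.sum()
    return (n / s0) * (z @ w @ z) / (z @ z)

# Hypothetical counts for 4 neighbouring regions and a binary contiguity matrix.
counts = np.array([12.0, 15.0, 3.0, 2.0])
weights = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
print(f"Moran's I = {morans_i(counts, weights):.3f}")
```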

  • Topic--Development and Application of the Big Data Platform for Grain Production
    ZHAO Peiqin, LIU Changbin, ZHENG Jie, MENG Yang, MEI Xin, TAO Ting, ZHAO Qian, MEI Guangyuan, YANG Xiaodong
    Smart Agriculture. 2025, 7(2): 106-116. https://doi.org/10.12133/j.smartag.SA202408009

    [Objective] Winter wheat yield is crucial for national food security and the standard of living of the population. Existing crop yield prediction models often show low accuracy under disaster-prone climatic conditions. This study proposed an improved hierarchical linear model (IHLM) based on a drought weather index reduction rate, aiming to enhance the accuracy of crop yield estimation under drought conditions. [Methods] The HLM was constructed using the maximum enhanced vegetation index-2 (EVI2max), meteorological data (precipitation, radiation, and temperature from March to May), and observed winter wheat yield data from 160 agricultural survey stations in Shandong province (2018-2021). To validate the model's accuracy, 70% of the data from Shandong province was randomly selected for model construction, and the remaining data was used to validate the accuracy of the yield model. The HLM treated the variation in meteorological factors as a key obstacle affecting crop growth and was improved by calculating relative meteorological factors, which helped reduce the impact of inter-annual differences in meteorological data. The accuracy of the HLM model was compared with that of the random forest (RF), support vector regression (SVR), and extreme gradient boosting (XGBoost) models. The HLM model provided more intuitive interpretation and was especially suitable for processing hierarchical data, which helped capture the variability of winter wheat yield data under drought conditions. Therefore, a drought weather index reduction rate model from the agricultural insurance industry was introduced to further optimize the HLM model, resulting in the construction of the IHLM model. The IHLM model was designed to improve crop yield prediction accuracy under drought conditions. Since the precipitation differences between Henan and Shandong provinces were small, to test the transferability of the IHLM model, Henan province sample data was processed in the same way as in Shandong, and the IHLM model was applied to Henan province to evaluate its performance under different geographical conditions. [Results and Discussions] The accuracy of the HLM model, improved based on relative meteorological factors (rMF), was higher than that of RF, SVR, and XGBoost. The validation accuracy showed a Pearson correlation coefficient (r) of 0.76, a root mean squared error (RMSE) of 0.60 t/hm², and a normalized RMSE (nRMSE) of 11.21%. On the drought conditions dataset, the model was further improved by incorporating the relationship between the winter wheat drought weather index and the reduction rate of winter wheat yield. After the improvement, the RMSE decreased by 0.48 t/hm², and the nRMSE decreased by 28.64 percentage points, significantly enhancing the accuracy of the IHLM model under drought conditions. The IHLM model also demonstrated good applicability when transferred to Henan province. [Conclusions] The IHLM model developed in this study improved the accuracy and stability of crop yield predictions, especially under drought conditions. Compared to the RF, SVR, and XGBoost models, the IHLM model was more suitable for predicting winter wheat yield. This research can be widely applied in the agricultural insurance field, playing a significant role in the design of agricultural insurance products, rate setting, and risk management. It enables more accurate predictions of winter wheat yield under drought conditions, with results that are closer to actual outcomes.
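The validation metrics reported above (Pearson r, RMSE in t/hm², and nRMSE as a percentage of the observed mean) can be computed as in this minimal sketch; the yield arrays are placeholders, not the study's data.

```python
# Minimal sketch of the yield-model validation metrics.
import numpy as np

def evaluate(y_obs: np.ndarray, y_pred: np.ndarray) -> dict:
    r = np.corrcoef(y_obs, y_pred)[0, 1]                 # Pearson correlation coefficient
    rmse = np.sqrt(np.mean((y_obs - y_pred) ** 2))       # t/hm²
    nrmse = 100.0 * rmse / y_obs.mean()                  # % of mean observed yield
    return {"r": r, "RMSE": rmse, "nRMSE(%)": nrmse}

y_obs = np.array([5.8, 6.1, 4.9, 6.4, 5.2])   # hypothetical observed yields, t/hm²
y_pred = np.array([5.5, 6.3, 5.1, 6.0, 5.4])  # hypothetical predicted yields, t/hm²
print(evaluate(y_obs, y_pred))
```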

  • Topic--Intelligent Agricultural Knowledge Services and Smart Unmanned Farms(Part 1)
    CHEN Junlin, ZHAO Peng, CAO Xianlin, NING Jifeng, YANG Shuqin
    Smart Agriculture. 2024, 6(6): 132-143. https://doi.org/10.12133/j.smartag.SA202408001

    [Objective] Plug tray seedling cultivation is a contemporary method known for its high germination rates, uniform seedling growth, shortened transplant recovery period, diminished pest and disease incidence, and enhanced labor efficiency. Despite these advantages, challenges such as missing or underdeveloped seedlings can arise due to seedling quality and environmental factors. To ensure uniformity and consistency of the seedlings, sorting is frequently necessary, and the adoption of automated seedling sorting technology can significantly reduce labor costs. Nevertheless, the overgrowth of seedlings within the plugs can affect the accuracy of detection algorithms. A method for grading and locating strawberry seedlings based on a lightweight YOLOv8s model was presented in this research to effectively mitigate the interference caused by overgrown seedlings. [Methods] The YOLOv8s model was selected as the baseline for detecting different categories of seedlings in the strawberry plug tray cultivation process, namely weak seedlings, normal seedlings, and plug holes. To improve the detection efficiency and reduce the model's computational cost, the layer-adaptive magnitude-based pruning (LAMP) score-based channel pruning algorithm was applied to compress the base YOLOv8s model. The pruning procedure involved using the dependency graph to derive the group matrices, followed by normalizing the group importance scores using the LAMP score, and ultimately pruning the channels according to these processed scores. This pruning strategy effectively reduced the number of model parameters and the overall size of the model, thereby significantly enhancing its inference speed while maintaining the capability to accurately detect both seedlings and plug holes. Furthermore, a two-stage seedling-hole matching algorithm was introduced based on the pruned YOLOv8s model. In the first stage, seedling and plug hole bounding boxes were matched according to their degree of overlap (Dp), resulting in an initial set of high-quality matches. This step helped minimize the number of potential matching holes for seedlings exhibiting overgrowth. Subsequently, before the second stage of matching, the remaining unmatched seedlings were ranked according to their potential matching hole scores (S), with higher scores indicating fewer potential matching holes. The seedlings were then prioritized during the second round of matching based on these scores, thus ensuring an accurate pairing of each seedling with its corresponding plug hole, even in cases where adjacent seedling leaves encroached into neighboring plug holes. [Results and Discussions] The pruning process inevitably resulted in the loss of some parameters that were originally beneficial for feature representation and model generalization, leading to a noticeable decline in model performance. However, through meticulous fine-tuning, the model's feature expression capabilities were restored, compensating for the information loss caused by pruning. Experimental results demonstrated that the fine-tuned model not only maintained high detection accuracy but also achieved significant reductions in FLOPs (86.3%) and parameter count (95.4%). The final model size was only 1.2 MB. Compared to the original YOLOv8s model, the pruned version showed improvements in several key performance metrics: precision increased by 0.4%, recall by 1.2%, mAP by 1%, and the F1-Score by 0.1%. The impact of the pruning rate on model performance was found to be non-linear.
As the pruning rate increased, model performance dropped significantly after certain crucial channels were removed. However, further pruning led to a reallocation of the remaining channels' weights, which in some cases allowed the model to recover or even exceed its previous performance levels. Consequently, it was necessary to experiment extensively to identify the optimal pruning rate that balanced model accuracy and speed. The experiments indicated that when the pruning rate reached 85.7%, the mAP peaked at 96.4%. Beyond this point, performance began to decline, suggesting that this was the optimal pruning rate for achieving a balance between model efficiency and performance, resulting in a model size of 1.2 MB. To further validate the improved model's effectiveness, comparisons were conducted with different lightweight backbone networks, including MobileNetv3, ShuffleNetv2, EfficientViT, and FasterNet, while retaining the Neck and Head modules of the original YOLOv8s model. Results indicated that the modified model outperformed these alternatives, with mAP improvements of 1.3%, 1.8%, 1.5%, and 1.1%, respectively, and F1-Score increases of 1.5%, 1.8%, 1.1%, and 1%. Moreover, the pruned model showed substantial advantages in terms of floating-point operations, model size, and parameter count compared to these other lightweight networks. To verify the effectiveness of the proposed two-stage seedling-hole matching algorithm, tests were conducted using a variety of complex images from the test set. Results indicated that the proposed method achieved precise grading and localization of strawberry seedlings even under challenging overgrowth conditions. Specifically, the correct matching rate for normal seedlings reached 96.6%, for missing seedlings 84.5%, and for weak seedlings 82.9%, with an average matching accuracy of 88%, meeting the practical requirements of the strawberry plug tray cultivation process. [Conclusions] The pruned YOLOv8s model successfully maintained high detection accuracy while reducing computational costs and improving inference speed. The proposed two-stage seedling-hole matching algorithm effectively minimized the interference caused by overgrown seedlings, accurately locating and classifying seedlings of various growth stages within the plug tray. The research provides a robust and reliable technical solution for automated strawberry seedling sorting in practical plug tray cultivation scenarios, offering valuable insights and technical support for optimizing the efficiency and precision of automated seedling grading systems.
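A minimal sketch of a two-stage seedling-to-hole matching by bounding-box overlap is shown below; the overlap measure (intersection area over seedling-box area) and the second-stage scoring rule are simplifying assumptions, not the paper's exact definitions of Dp and S.

```python
# Minimal sketch of two-stage matching of seedling boxes to plug-hole boxes.
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # x1, y1, x2, y2

def overlap_ratio(seedling: Box, hole: Box) -> float:
    """Intersection area divided by the seedling-box area (assumed overlap measure)."""
    ix1, iy1 = max(seedling[0], hole[0]), max(seedling[1], hole[1])
    ix2, iy2 = min(seedling[2], hole[2]), min(seedling[3], hole[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = (seedling[2] - seedling[0]) * (seedling[3] - seedling[1])
    return inter / area if area > 0 else 0.0

def match(seedlings: List[Box], holes: List[Box], thr: float = 0.6) -> dict:
    free_holes = set(range(len(holes)))
    matches = {}
    # Stage 1: assign seedlings that clearly sit inside exactly one hole.
    for i, s in enumerate(seedlings):
        cands = [j for j in free_holes if overlap_ratio(s, holes[j]) >= thr]
        if len(cands) == 1:
            matches[i] = cands[0]
            free_holes.remove(cands[0])
    # Stage 2: handle the rest, starting with seedlings that have the fewest candidate holes.
    remaining = [i for i in range(len(seedlings)) if i not in matches]
    remaining.sort(key=lambda i: sum(overlap_ratio(seedlings[i], holes[j]) > 0
                                     for j in free_holes))
    for i in remaining:
        best = max(free_holes, key=lambda j: overlap_ratio(seedlings[i], holes[j]),
                   default=None)
        if best is not None and overlap_ratio(seedlings[i], holes[best]) > 0:
            matches[i] = best
            free_holes.remove(best)
    return matches
```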

  • Overview Article
    ZHANG Zhiyong, CAO Shanshan, KONG Fantao, LIU Jifang, SUN Wei
    Smart Agriculture. 2025, 7(3): 48-68. https://doi.org/10.12133/j.smartag.SA202305005

    [Significance] Estrus monitoring and identification in cows is a crucial aspect of breeding management in beef and dairy cattle farming. Innovations in precise sensing and intelligent identification methods and technologies for estrus in cows are essential not only for scientific breeding, precise management, and smart breeding at the population level, but also play a key supportive role in health management, productivity enhancement, and animal welfare improvement at the individual level. This review aims to provide a reference for scientific management and the study of modern production technologies in the beef and dairy cattle industry, as well as theoretical methodologies for the research and development of key technologies in precision livestock farming. [Progress] Based on describing the typical characteristics of normal and abnormal estrus in cows, this paper systematically categorizes and summarizes the recent research progress, development trends, and methodological approaches in estrus monitoring and identification technologies, focusing on the monitoring and diagnosis of key physiological signs and behavioral characteristics during the estrus period. Firstly, the paper outlines the digital monitoring technologies for three critical physiological parameters (body temperature, rumination, and activity levels) and their applications in cow estrus monitoring and identification. It analyzes the intrinsic reasons for performance bottlenecks in estrus monitoring models based on body temperature, compares the reliability issues faced by activity-based estrus monitoring, and addresses the difficulties in balancing model generalization and robustness design. Secondly, the paper examines estrus sensing and identification technologies based on three typical behaviors: feeding, vocalization, and sexual desire. It highlights the latest applications of new artificial intelligence technologies, such as computer vision and deep learning, in estrus monitoring and points out the critical role of these technologies in improving the accuracy and timeliness of monitoring. Finally, the paper focuses on multi-factor fusion technologies for estrus perception and identification, summarizing how different researchers combine various physiological and behavioral parameters using diverse monitoring devices and algorithms to enhance accuracy in estrus monitoring. It emphasizes that multi-factor fusion methods can improve detection rates and the precision of identification results, being more reliable and applicable than single-factor methods. The importance and potential of multi-modal information fusion in enhancing monitoring accuracy and adaptability are underlined. The current shortcomings of multi-factor information fusion methods are analyzed, such as the potential impact on animal welfare from parameter acquisition methods, the singularity of model algorithms used for representing multi-factor information fusion, and inadequacies in research on multi-factor feature extraction models and estrus identification decision algorithms. [Conclusions and Prospects] From the perspectives of system practicality, stability, environmental adaptability, cost-effectiveness, and ease of operation, several key issues are discussed that need to be addressed in further research on precise sensing and intelligent identification technologies for cow estrus within the context of high-quality development in digital livestock farming.
These include improving monitoring accuracy under weak estrus conditions, overcoming technical challenges of audio extraction and voiceprint construction amidst complex background noise, enhancing the adaptability of computer vision monitoring technologies, and establishing comprehensive monitoring and identification models through multi-modal information fusion. It specifically discusses the numerous challenges posed by these issues to current technological research and explains that future research needs to focus not only on improving the timeliness and accuracy of monitoring technologies but also on balancing system cost-effectiveness and ease of use to achieve a transition from the concept of smart farming to its practical implementation.

  • Topic--Development and Application of the Big Data Platform for Grain Production
    YANG Guijun, ZHAO Chunjiang, YANG Xiaodong, YANG Hao, HU Haitang, LONG Huiling, QIU Zhengjun, LI Xian, JIANG Chongya, SUN Liang, CHEN Lei, ZHOU Qingbo, HAO Xingyao, GUO Wei, WANG Pei, GAO Meiling
    Smart Agriculture. 2025, 7(2): 1-12. https://doi.org/10.12133/j.smartag.SA202409014

    [Significance] The explosive development of agricultural big data has accelerated agricultural production into a new era of digitalization and intelligence. Agricultural big data is the core element in promoting agricultural modernization and the foundation of intelligent agriculture. As a new productive force, big data enhances comprehensive intelligent management decision-making throughout the whole process of grain production. However, it faces problems such as an unclear management mechanism for grain production big data resources and the lack of a full-chain decision-making algorithm system and big data platform covering the whole process and all elements of grain production. [Progress] A grain production big data platform is a comprehensive service platform that uses modern information technologies such as big data, the Internet of Things (IoT), remote sensing, and cloud computing to provide intelligent decision-making support for the whole process of grain production, based on intelligent algorithms for data collection, processing, analysis, and monitoring related to grain production. In this paper, the progress and challenges in grain production big data and in monitoring and decision-making algorithms are reviewed, as well as big data platforms in China and worldwide. With the development of IoT and high-resolution multi-modal remote sensing technology, the massive agricultural big data generated by the "Space-Air-Ground" Integrated Agricultural Monitoring System has laid an important foundation for smart agriculture and promoted the shift of smart agriculture from model-driven to data-driven. However, there are still some issues in field management decision-making: the requirements for high spatio-temporal resolution and timeliness of information are difficult to meet, and algorithm migration and localization methods based on big data need to be studied. In addition, agricultural machinery operation and spatio-temporal scheduling algorithms that use remote sensing and IoT monitoring information to determine the appropriate operation time window and operation prescription need to be further developed, especially cross-regional scheduling algorithms for agricultural machinery during the summer harvest in China. Aiming to address the lack of bi-directional connectivity between monitoring and decision-making algorithms in grain production, as well as the insufficient integration of agricultural machinery and information perception, a framework for a grain production big data intelligent platform based on digital twins is proposed. The platform leverages multi-source heterogeneous grain production big data and integrates a full-chain suite of standardized algorithms, including data acquisition, information extraction, knowledge map construction, intelligent decision-making, and full-chain collaboration of agricultural machinery operations. It covers typical application scenarios such as irrigation, fertilization, pest and disease management, and emergency response to drought and flood disasters, all enabled by digital twin technology.
[Conclusions and Prospects] The suggestions and trends for the development of grain production big data platforms are summarized in three aspects: (1) Creating an open, symbiotic grain production big data platform, with core characteristics such as open interfaces for crop and environmental sensors, maturity grading and a cloud-native packaging mechanism for core algorithms, and highly efficient response to data and decision services; (2) Focusing on the typical application scenarios of grain production, taking the exploration of technology integration and bi-directional connectivity as the foundation, and intelligent service as the core of the development path for big data platform research; (3) Building a data-algorithm-service self-organizing regulation mechanism, integrating decision-making information with intelligent equipment operation, and providing standardized, compatible, and open service capabilities, which together can form new quality productive forces to ensure food security and green, efficient grain production.

  • Topic--Intelligent Agricultural Knowledge Services and Smart Unmanned Farms(Part 1)
    LU Bibo, LIANG Di, YANG Jie, SONG Aiqing, HUANGFU Shangwei
    Smart Agriculture. 2024, 6(6): 109-120. https://doi.org/10.12133/j.smartag.SA202407007

    [Objective] Crop leaf area is an important indicator reflecting light absorption efficiency and growth conditions. This paper established a diverse Chinese yam image dataset and proposed a deep learning-based method for Chinese yam leaf image segmentation. This method can be used for real-time measurement of Chinese yam leaf area, addressing the inefficiency of traditional measurement techniques, and will provide more reliable data support for genetic breeding and growth and development research of Chinese yam, promoting the development and progress of the Chinese yam industry. [Methods] A lightweight segmentation network based on improved ENet was proposed. Firstly, based on ENet, the third stage was pruned to reduce redundant calculations in the model. This improved the computational efficiency and running speed, and provided a good basis for real-time applications. Secondly, PConv was used instead of the conventional convolution in the downsampling bottleneck structure and the conventional bottleneck structure; the improved bottleneck structure was named P-Bottleneck. PConv applied conventional convolution to only a portion of the input channels and left the rest of the channels unchanged, which reduced memory accesses and redundant computations for more efficient spatial feature extraction. PConv was used to reduce the amount of model computation while increasing the number of floating-point operations per second on the hardware device, resulting in lower latency. Additionally, the transposed convolution in the upsampling module was replaced with bilinear interpolation to enhance model accuracy and reduce the number of parameters. Bilinear interpolation could process images more smoothly, making the processed images more realistic and clear. Finally, a coordinate attention (CA) module was added to the encoder to introduce the attention mechanism, and the model was named CBPA-ENet. The CA mechanism not only focused on the channel information, but also keenly captured orientation and position-sensitive information. The position information was embedded into the channel attention to globally encode the spatial information, capturing the channel information along one spatial direction while retaining the position information along the other spatial direction. The network could effectively enhance the attention to important regions in the image, and thus improve the quality and interpretability of segmentation results. [Results and Discussions] Trimming the third part resulted in a 28% decrease in FLOPs, a 41% decrease in parameters, and a 9 f/s increase in FPS. Improving the upsampling method to bilinear interpolation not only reduced the floating-point operations and parameters, but also slightly improved the segmentation accuracy of the model, increasing FPS by 4 f/s. Using P-Bottleneck instead of the downsampling bottleneck structure and the conventional bottleneck structure reduced mIoU by only 0.04%, while reducing FLOPs by 22%, reducing parameters by 16%, and increasing FPS by 8 f/s. Adding the CA mechanism to the encoder increased FLOPs and parameters only slightly while improving the accuracy of the segmentation network. To verify the effectiveness of the improved segmentation algorithm, the classic semantic segmentation networks UNet, DeepLabV3+, and PSPNet and the real-time semantic segmentation networks LinkNet and DABNet were selected for training and validation. All six algorithms achieved high segmentation accuracy, among which UNet had the best mIoU and mPA, but its model size was too large.
The improved algorithm accounted for only 1% of the FLOPs and 0.41% of the parameters of UNet, while its mIoU and mPA were basically the same. Other classic semantic segmentation algorithms, such as DeepLabV3+, had accuracy similar to the improved algorithm, but their large model size and slow inference speed were not conducive to embedded development. Although the real-time semantic segmentation algorithm LinkNet had a slightly higher mIoU, its FLOPs and parameter count were still far greater than those of the improved algorithm. Although the PSPNet model was relatively small, it was still much larger than the improved model, and its mIoU and mPA were lower. The experimental results showed that the improved model achieved an mIoU of 98.61%. Compared with the original model, the number of parameters and FLOPs decreased significantly: the number of model parameters decreased by 51%, the FLOPs decreased by 49%, and the network operation speed increased by 38%. [Conclusions] The improved algorithm can accurately and quickly segment Chinese yam leaves, providing not only a more accurate means of determining Chinese yam phenotype data, but also a new method and approach for embedded research on Chinese yam. Using the model, the morphological feature data of Chinese yam leaves can be obtained more efficiently, providing a reliable foundation for further research and analysis.
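Two of the drop-in changes described above, partial convolution (PConv) and bilinear interpolation in place of transposed-convolution upsampling, can be sketched in PyTorch as follows; the channel fraction and tensor sizes are illustrative and not taken from the paper.

```python
# Minimal PyTorch sketch: PConv convolves only a fraction of the channels,
# and parameter-free bilinear upsampling replaces transposed convolution.
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Apply a 3x3 conv to the first 1/n_div of channels; pass the rest through unchanged."""
    def __init__(self, channels: int, n_div: int = 4):
        super().__init__()
        self.dim_conv = channels // n_div
        self.conv = nn.Conv2d(self.dim_conv, self.dim_conv, 3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.dim_conv, x.size(1) - self.dim_conv], dim=1)
        return torch.cat([self.conv(x1), x2], dim=1)

upsample = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)

x = torch.randn(1, 64, 32, 32)
print(PConv(64)(x).shape)   # torch.Size([1, 64, 32, 32])
print(upsample(x).shape)    # torch.Size([1, 64, 64, 64])
```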

  • Topic--Intelligent Agricultural Knowledge Services and Smart Unmanned Farms(Part 1)
    ZHANG Hui, HU Jun, SHI Hang, LIU Changxi, WU Miao
    Smart Agriculture. 2024, 6(6): 85-95. https://doi.org/10.12133/j.smartag.SA202406013

    [Objective] Spraying calcium can effectively prevent the occurrence of dry burning heart disease in Chinese cabbage, and accurately targeted calcium spraying can further improve the utilization rate of calcium. Since the sprayer needs to move rapidly in the field, over-application or under-application of the pesticide can occur. This study aimed to develop a targeted spray control system based on deep learning technology and to explore the relationship between the advance speed, spray volume, and coverage of the sprayer, thereby addressing the uneven application caused by varying sprayer speeds in the real scenario of calcium application to Chinese cabbage hearts. [Methods] The targeted spraying control system incorporated advanced sensors and computing equipment that were capable of obtaining real-time data regarding the location of crops and the surrounding environmental conditions. These data allowed dynamic adjustments to be made to the spraying system, ensuring that pesticides were delivered with high precision. To further enhance the system's real-time performance and accuracy, the YOLOv8 object detection model was improved. A Ghost-Backbone lightweight network structure was introduced, integrating remote sensing technologies along with the sprayer's forward speed and the frequency of spray responses. This combination resulted in a YOLOv8-Ghost-Backbone lightweight model specifically tailored for agricultural applications. The model operated on the Jetson Xavier NX controller, a high-performance, low-power computing platform designed for edge computing, which allowed the system to process complex tasks in real time directly in the field. The targeted spraying system was composed of two essential components: a pressure regulation unit and a targeted control unit. The pressure regulation unit was responsible for adjusting the pressure within the spraying system to ensure that the output remained stable under various operational conditions. Meanwhile, the targeted control unit played a crucial role in precisely controlling the direction, volume, and coverage of the spray to ensure that the pesticide was applied effectively to the intended areas of the plants. To rigorously evaluate the performance of the system, a series of intermittent spray tests were conducted. During these tests, the forward speed of the sprayer was gradually increased, allowing an assessment of how well the system responded to changes in speed. Throughout the testing phase, the response frequency of the electromagnetic valve was measured to calculate the corresponding spray volume for each nozzle. [Results and Discussions] The experimental results indicated that the overall performance of the targeted spraying system was outstanding, particularly under conditions of high-speed operation. By meticulously recording the response times of the three primary components of the system, valuable data were gathered. The average time required for image processing was determined to be 29.50 ms, while the transmission of decision signals took an average of 6.40 ms. The actual spraying process itself required 88.83 ms to complete. A thorough analysis of these times revealed that the total response of the spraying system lagged approximately 124.73 ms behind the electrical signal inputs. Despite the inherent delays, the system was able to maintain a high level of spraying accuracy by compensating for the response lag of the electromagnetic valve.
Specifically, when tested at a speed of 7.2 km/h, the difference between the actual spray volume delivered and the required spray volume, after accounting for compensation, was found to be a mere 0.01 L/min. This minimal difference indicates that the system met the standard operational requirements for effective pesticide application, thereby demonstrating its precision and reliability in practical settings. [Conclusions] This study developed and validated a deep learning-based targeted spraying control system that exhibited excellent performance in both spraying accuracy and response speed. The system serves as a significant technical reference for future endeavors in agricultural automation. Moreover, the research provides insights into how to maintain consistent spraying effectiveness and optimize pesticide utilization efficiency by dynamically adjusting the spraying system as the operating speed varies. The findings of this research will offer valuable experience and guidance for the implementation of agricultural robots in the precise application of pesticides, with a particular emphasis on parameter selection and system optimization.
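The lag-compensation arithmetic implied by the timing figures above can be restated in a few lines: at a given forward speed, the valve must be triggered early enough to cover the total response delay. The function name below is hypothetical; the numbers simply restate the reported measurements.

```python
# Minimal sketch of the lead distance needed to compensate for system response delay.
def trigger_lead_distance(speed_kmh: float, delay_ms: float) -> float:
    """Distance (m) the sprayer travels during the system's response delay."""
    speed_ms = speed_kmh / 3.6          # km/h -> m/s
    return speed_ms * delay_ms / 1000.0

total_delay_ms = 29.50 + 6.40 + 88.83   # image processing + signal transmission + spraying
print(f"Total delay: {total_delay_ms:.2f} ms")
print(f"Lead distance at 7.2 km/h: {trigger_lead_distance(7.2, total_delay_ms):.3f} m")
```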

  • Topic--Intelligent Agricultural Knowledge Services and Smart Unmanned Farms (Part 2)
    ZHU Shunyao, QU Hongjun, XIA Qian, GUO Wei, GUO Ya
    Smart Agriculture. 2025, 7(1): 85-96. https://doi.org/10.12133/j.smartag.SA202410004

    [Objective] Plant leaf shape is an important part of the plant architectural model. Establishing a three-dimensional structural model of leaves may assist in simulating and analyzing plant growth. However, existing leaf modeling approaches lack interpretability, invertibility, and operability, which limits the estimation of model parameters, the simulation of leaf shape, the analysis and interpretation of leaf physiology and growth state, and model reuse. Aiming at the interoperability between three-dimensional structure representation and mathematical model parameters, this study paid attention to three aspects of wheat leaf shape parametric reconstruction: (1) parameter-driven model structure, (2) model parameter inversion, and (3) parameter dynamic mapping during growth. Based on this, a parameter-driven and point cloud inversion model for wheat leaf interoperability was proposed. [Methods] A parametric surface model of a wheat leaf with seven characteristic parameters was built using parametric modeling technology, and the forward parametric construction of the wheat leaf structure was realized. Three parameters, maximum leaf width, leaf length, and leaf shape factor, were used to describe the basic shape of the blade on the leaf plane. On this basis, two parameters, namely the angle between stem and leaf and the curvature degree, were introduced to describe the bending characteristics of the main vein of the blade in three-dimensional space. Two further parameters, namely the twist angle around the axis and the twist deviation angle around the axis, were introduced to represent the twisted structure of the leaf blade along the vein. The reverse parameter estimation module was built according to the surface model. The point cloud was divided by a uniform segmentation method along the Y-axis, and the veins were fitted by a least squares regression method. Then, the point cloud was re-segmented according to the fitted vein curve. Subsequently, the rotation angle was precisely determined through the segment-wise transform estimation method, with all parameters being optimally fitted using the RANSAC regression algorithm. To validate the reliability of the proposed methodology, a set of sample parameters was randomly generated, from which corresponding sample point clouds were synthesized. These sample point clouds were then subjected to estimation using the described method, and error analysis was carried out on the estimation results. Three-dimensional imaging technology was used to collect point clouds of Zhengmai 136, Yangmai 34, and Yanmai 1 samples. After noise reduction and coordinate registration, the model parameters were inverted and estimated, and the reconstructed point clouds were produced using the parametric model. The reconstruction error was validated by calculating the dissimilarity, represented by the Chamfer Distance, between the reconstructed point cloud and the measured point cloud. [Results and Discussions] The model could effectively reconstruct wheat leaves, and the average deviation of the point cloud-based parametric reconstruction results was about 1.2 mm, indicating high precision. Parametric modeling technology based on prior knowledge and point cloud fitting technology based on posterior data were integrated in this study to construct a digital twin model of a specific species at the 3D structural level.
Although some of the detailed characteristics of the leaves were moderately simplified, the geometric shape of the leaves could be highly restored with only a few parameters. This method was not only simple, direct, and efficient, but the obtained parameters also had explicit geometric meaning and were both editable and interpretable. In addition, the traditional practice of using only tools such as rulers to measure individual characteristic parameters of plant organs was abandoned in this study. High-precision point cloud acquisition technology was adopted to obtain three-dimensional data of wheat leaves, and pre-processing work such as point cloud registration, segmentation, and annotation was completed, laying a data foundation for subsequent research. [Conclusions] There is interoperability between the reconstructed model and the point cloud, and the parameters of the model can be flexibly adjusted to generate leaf clusters with similar shapes. The inversion parameters have high interpretability and can be used for consistent and continuous estimation of point cloud time series. This research is of great value to the simulation analysis and digital twinning of wheat leaves.
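The Chamfer Distance used above to score reconstruction error can be computed with SciPy k-d trees as in this minimal sketch; whether the two directions are averaged or summed in the paper is an assumption, and the point clouds below are placeholders.

```python
# Minimal sketch of a symmetric Chamfer Distance between two point clouds.
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric mean nearest-neighbour distance between point sets p and q (N x 3)."""
    d_pq, _ = cKDTree(q).query(p)   # for each point in p, distance to nearest point in q
    d_qp, _ = cKDTree(p).query(q)
    return 0.5 * (d_pq.mean() + d_qp.mean())

reconstructed = np.random.rand(500, 3)                                   # placeholder cloud
measured = reconstructed + np.random.normal(scale=0.001, size=(500, 3))  # placeholder cloud
print(f"Chamfer distance: {chamfer_distance(reconstructed, measured):.4f}")
```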

  • Topic--Intelligent Agricultural Knowledge Services and Smart Unmanned Farms(Part 1)
    FU Zhuojun, HU Zheng, DENG Yangjun, LONG Chenfeng, ZHU Xinghui
    Smart Agriculture. 2024, 6(6): 144-154. https://doi.org/10.12133/j.smartag.SA202409001

    [Objective] Apple Alternaria leaf spot can easily lead to premature defoliation of apple tree leaves, thereby affecting the quality and yield of apples. Consequently, accurate detection of the disease has become a critical issue in the precise prevention and control of apple tree diseases. Due to factors such as backlighting, traditional image segmentation-based methods for detecting disease spots struggle to accurately identify the boundaries of diseased areas against complex backgrounds. There is an urgent need to develop new methods for detecting apple Alternaria leaf spot, which can assist in the precise prevention and control of apple tree diseases. [Methods] A novel detection method named Deep Semi-Non-negative Matrix Factorization-based Mahalanobis Distance Anomaly Detection (DSNMFMAD) was proposed, which combines Deep Semi-Non-negative Matrix Factorization (DSNMF) with Mahalanobis distance for robust anomaly detection against complex image backgrounds. The proposed method began by utilizing DSNMF to extract low-rank background components and sparse anomaly features from the apple Alternaria leaf spot images. This enabled effective separation of the background and anomalies, mitigating interference from complex background noise while preserving the non-negativity constraints inherent in the data. Subsequently, Mahalanobis distance was employed, based on the Singular Value Decomposition (SVD) feature subspace, to construct a lesion detector. The detector identified lesions by calculating the anomaly degree of each pixel in the anomalous regions. The apple tree leaf disease dataset used was provided by PaddlePaddle AI-Studio. Each image in the dataset had a resolution of 512×512 pixels and was stored in RGB color as a JPEG file. The dataset was captured in both laboratory and natural environments. Under laboratory conditions, 190 images of apple leaves with spot-induced leaf drop were used, while 237 images were collected under natural conditions. Furthermore, the dataset was augmented with geometric transformations and random changes in brightness, contrast, and hue, resulting in 1 145 images under laboratory conditions and 1 419 images under natural conditions. These images reflect various real-world scenarios, capturing apple leaves at different stages of maturity and in diverse lighting conditions, angles, and noise environments. This diverse dataset ensured that the proposed method could be tested under a wide range of practical conditions, providing a comprehensive evaluation of its effectiveness in detecting apple Alternaria leaf spot. [Results and Discussions] DSNMFMAD demonstrated outstanding performance under both laboratory and natural conditions. A comparative analysis was conducted with several other detection methods, including GRX (Reed-Xiaoli detector), LRX (Local Reed-Xiaoli detector), CRD (Collaborative-Representation-Based Detector), LSMAD (LRaSMD-Based Mahalanobis Distance Detector), and the deep learning model Unet. The results demonstrated that DSNMFMAD exhibited superior performance in the laboratory environment, attaining a recognition accuracy of 99.8% and a detection speed of 0.087 2 s/image. The accuracy of DSNMFMAD exceeded that of GRX, LRX, CRD, LSMAD, and Unet by 0.2%, 37.9%, 10.3%, 0.4%, and 24.5%, respectively. Additionally, DSNMFMAD exhibited a substantially superior detection speed in comparison to LRX, CRD, LSMAD, and Unet, with improvements of 8.864, 107.185, 0.309, and 1.565 s, respectively.
In the natural environment, where a dataset of 1 419 images of apple Alternaria leaf spot was analyzed, DSNMFMAD demonstrated an 87.8% recognition accuracy, with an average detection speed of 0.091 0 s per image. In this case, its accuracy exceeded that of GRX, LRX, CRD, LSMAD, and Unet by 2.5%, 32.7%, 5%, 14.8%, and 3.5%, respectively. Furthermore, the detection speed was faster than that of LRX, CRD, LSMAD, and Unet by 2.898, 132.017, 0.224, and 1.825 s, respectively. [Conclusions] The DSNMFMAD proposed in this study was capable of effectively extracting anomalous parts of an image through DSNMF and accurately detecting the location of apple Alternaria leaf spot using a constructed lesion detector. This method achieved higher detection accuracy compared to the benchmark methods, even under complex background conditions, demonstrating excellent performance in lesion detection. This advancement could provide a valuable technical reference for the detection and prevention of apple Alternaria leaf spot.
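
    To illustrate the Mahalanobis-distance scoring step, the following minimal Python sketch shows how per-pixel anomaly scores can be computed in an SVD-derived feature subspace. It is not the authors' DSNMFMAD implementation (the DSNMF decomposition step is omitted), and all function and parameter names are illustrative.

import numpy as np

def mahalanobis_anomaly_scores(pixels, n_components=2):
    """Score each pixel by its Mahalanobis distance in an SVD feature subspace.

    pixels: (N, D) array, one row per pixel (e.g. RGB or other features).
    Returns an (N,) array of anomaly scores; larger means more anomalous.
    """
    # Center the data and take the leading right-singular vectors as the subspace.
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]                 # (k, D) feature subspace

    # Project pixels into the subspace and estimate its covariance.
    proj = centered @ basis.T                 # (N, k)
    cov = np.cov(proj, rowvar=False) + 1e-6 * np.eye(n_components)
    cov_inv = np.linalg.inv(cov)

    # Mahalanobis distance of each projected pixel from the subspace mean.
    diff = proj - proj.mean(axis=0)
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

# Usage: flatten an H x W x 3 image to (H*W, 3), score, then threshold.
img = np.random.rand(64, 64, 3)               # stand-in for a leaf image
scores = mahalanobis_anomaly_scores(img.reshape(-1, 3)).reshape(64, 64)
lesion_mask = scores > np.percentile(scores, 99)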

  • Information Processing and Decision Making
    CHANGJian, WANGBingbing, YINLong, LIYanqing, LIZhaoxin, LIZhuang
    Smart Agriculture. 2025, 7(3): 185-198. https://doi.org/10.12133/j.smartag.SA202503033

    [Objective] Bee pollination is pivotal to plant reproduction and crop yield, making its identification and monitoring highly significant for agricultural production. However, practical detection of bee pollination poses various challenges, including the small size of bee targets, their low pixel occupancy in images, and the complexity of floral backgrounds. To scientifically evaluate pollination efficiency, accurately detect the pollination status of flowers, and provide reliable data to guide flower and fruit thinning in orchards, thereby supporting the scientific management of bee colonies and enhancing agricultural efficiency, a lightweight recognition model that can effectively overcome the above obstacles was proposed, advancing the practical application of bee pollination detection technology in smart agriculture. [Methods] A specialized bee pollination dataset was constructed comprising three flower types: strawberry, blueberry, and chrysanthemum. High-resolution cameras were used to record videos of the pollination process, which were then subjected to frame sampling to extract representative images. These initial images underwent manual screening to ensure quality and relevance. To address challenges such as limited data diversity and class imbalance, a comprehensive data augmentation strategy was employed. Techniques including rotation, flipping, brightness adjustment, and mosaic augmentation were applied, significantly expanding the dataset's size and variability. The enhanced dataset was subsequently split into training and validation sets at an 8:2 ratio to ensure robust model evaluation. The base detection model was built upon an improved YOLOv10n architecture. The conventional C2f module in the backbone was replaced with a novel cross stage partial network_multi-scale edge information enhance (CSP_MSEE) module, which synergizes the cross-stage partial connections from cross stage partial network (CSPNet) with a multi-scale edge enhancement strategy. This design greatly improved feature extraction, particularly in scenarios involving fine-grained structures and small-scale targets like bees. For the neck, a hybrid-scale feature pyramid network (HS-FPN) was implemented, incorporating a channel attention (CA) mechanism and a dimension matching (DM) module to refine and align multi-scale features. These features were further integrated through a selective feature fusion (SFF) module, enabling the effective combination of low-level texture details and high-level semantic representations. The detection head was replaced with the lightweight shared detail enhanced convolutional detection head (LSDECD), an enhanced version of the lightweight shared convolutional detection head (LSCD). It incorporated detail enhancement convolution (DEConv) from DEA-Net to improve the extraction of fine-grained bee features. Additionally, the standard convolution_groupnorm (Conv_GN) layers were replaced with detail enhancement convolution_groupnorm (DEConv_GN), significantly reducing model parameters and enhancing the model's sensitivity to subtle bee behaviors. This lightweight yet accurate model design made it highly suitable for real-time deployment on resource-constrained edge devices in agricultural environments. [Results and Discussions] Experimental results on the three bee pollination datasets (strawberry, blueberry, and chrysanthemum) demonstrated the effectiveness of the proposed improvements over the baseline YOLOv10n model.
The enhanced model achieved significant reductions in computational overhead, lowering the computational complexity by 3.1 GFLOPs and the number of parameters by 1.3 M. The computational cost of the improved model reached 5.1 GFLOPs, and the number of parameters was 1.3 M. These reductions contributed to improved efficiency, making the model more suitable for deployment on edge devices with limited processing capabilities, such as mobile platforms or embedded systems used in agricultural monitoring. In terms of detection performance, the improved model showed consistent gains across all three datasets. Specifically, the recall rates reached 82.6% for strawberry flowers, 84.0% for blueberry flowers, and 84.8% for chrysanthemum flowers. Corresponding mAP50 (mean average precision at an IoU threshold of 0.5) scores were 89.3%, 89.5%, and 88.0%, respectively. Compared to the original YOLOv10n model, these results marked respective improvements of 2.1% in recall and 1.7% in mAP50 on the strawberry dataset, 2.0% and 2.6% on the blueberry dataset, and 2.1% and 2.2% on the chrysanthemum dataset. [Conclusions] The proposed YOLOv10n-CHL lightweight bee pollination detection model, through coordinated enhancements at multiple architectural levels, achieved notable improvements in both detection accuracy and computational efficiency across multiple bee pollination datasets. The model significantly improved the detection performance for small objects while substantially reducing computational overhead, facilitating its deployment on edge computing platforms such as drones and embedded systems. This research could provide a solid technical foundation for the precise monitoring of bee pollination behavior and the advancement of smart agriculture. Nevertheless, the model's adaptability to extreme lighting and complex weather conditions remains an area for improvement. Future work will focus on enhancing the model's robustness in these scenarios to support its broader application in real-world agricultural environments.
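
    As a rough illustration of the shared lightweight head idea described above, the PyTorch sketch below reuses a single Conv+GroupNorm stem across all feature-pyramid levels. It is not the LSDECD module from the paper (the detail enhancement convolution is replaced with a plain convolution for brevity), and the channel sizes, class count, and pyramid resolutions are assumptions.

import torch
import torch.nn as nn

class SharedConvGNHead(nn.Module):
    """Toy detection head: one Conv+GroupNorm stem shared across pyramid levels,
    followed by small per-task 1x1 convolutions (classification and box regression)."""

    def __init__(self, in_channels=64, num_classes=1, num_groups=16):
        super().__init__()
        # Reusing the same stem weights at every level keeps the parameter
        # count low, which is the core idea behind a shared lightweight head.
        self.stem = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1, bias=False),
            nn.GroupNorm(num_groups, in_channels),
            nn.SiLU(),
        )
        self.cls_out = nn.Conv2d(in_channels, num_classes, 1)  # class logits
        self.reg_out = nn.Conv2d(in_channels, 4, 1)            # box offsets

    def forward(self, pyramid_feats):
        outputs = []
        for feat in pyramid_feats:       # e.g. P3, P4, P5 feature maps
            x = self.stem(feat)          # identical stem weights at every level
            outputs.append((self.cls_out(x), self.reg_out(x)))
        return outputs

# Usage with three dummy pyramid levels, as for a 640x640 input.
feats = [torch.randn(1, 64, s, s) for s in (80, 40, 20)]
head = SharedConvGNHead()
cls_p3, reg_p3 = head(feats)[0]
print(cls_p3.shape, reg_p3.shape)   # (1, 1, 80, 80) and (1, 4, 80, 80)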

  • Topic--Intelligent Agricultural Knowledge Services and Smart Unmanned Farms (Part 2)
    QIZijun, NIUDangdang, WUHuarui, ZHANGLilin, WANGLunfeng, ZHANGHongming
    Smart Agriculture. 2025, 7(1): 44-56. https://doi.org/10.12133/j.smartag.SA202410022

    [Objective] Chinese kiwifruit texts exhibit unique dual-dimensional characteristics. Their complex semantic structure involves cross-paragraph dependencies, which makes it challenging to capture the full contextual relationships of entities within a single paragraph and necessitates models capable of robust cross-paragraph semantic extraction to comprehend entity linkages at a global level. However, most existing models rely heavily on local contextual information and struggle to process long-distance dependencies, thereby reducing recognition accuracy. Furthermore, Chinese kiwifruit texts often contain highly nested entities. This nesting and combination increase the complexity of grammatical and semantic relationships, making entity recognition more difficult. To address these challenges, a novel named entity recognition (NER) method, KIWI-Coord-Prune (kiwifruit-CoordKIWINER-PruneBi-LSTM), was proposed in this research, which incorporated dual-dimensional information processing and pruning techniques to improve recognition accuracy. [Methods] The proposed KIWI-Coord-Prune model consisted of a character embedding layer, a CoordKIWINER layer, a PruneBi-LSTM layer, a self-attention mechanism, and a CRF decoding layer, enabling effective entity recognition after processing input character vectors. The CoordKIWINER and PruneBi-LSTM modules were specifically designed to handle the dual-dimensional features in Chinese kiwifruit texts. The CoordKIWINER module applied adaptive average pooling in two directions on the input feature maps and utilized convolution operations to separate the extracted features into vertical and horizontal branches. The horizontal and vertical features were then independently extracted using the Criss-Cross Attention (CCNet) mechanism and Coordinate Attention (CoordAtt) mechanism, respectively. This module significantly enhanced the model's ability to capture cross-paragraph relationships and nested entity structures, thereby generating enriched character vectors containing more contextual information, which improved the overall representation capability and robustness of the model. The PruneBi-LSTM module was built upon the enhanced dual-dimensional vector representations and introduced a pruning strategy into Bi-LSTM to effectively reduce redundant parameters associated with background descriptions and irrelevant terms. This pruning mechanism maintained the dynamic sequence modeling capability of Bi-LSTM while enhancing computational efficiency and improving inference speed. Additionally, a dynamic feature extraction strategy was employed to reduce the computational complexity of vector sequences and further strengthen the learning capacity for key features, leading to improved recognition of complex entities in kiwifruit texts. Furthermore, the pruned weight matrices became sparser, significantly reducing memory consumption. This made the model more efficient in handling large-scale agricultural text-processing tasks, minimizing redundant information while achieving higher inference and training efficiency with fewer computational resources. [Results and Discussions] Experiments were conducted on the self-built KIWIPRO dataset and four public datasets: People's Daily, ClueNER, Boson, and ResumeNER. The proposed model was compared with five advanced NER models: LSTM, Bi-LSTM, LR-CNN, Softlexicon-LSTM, and KIWINER.
The experimental results showed that KIWI-Coord-Prune achieved F1-Scores of 89.55%, 91.02%, 83.50%, 83.49%, and 95.81% on the five datasets, respectively, outperforming all baseline models. Furthermore, controlled variable experiments were conducted to compare and ablate the CoordKIWINER and PruneBi-LSTM modules across the five datasets, confirming their effectiveness and necessity. Additionally, the impact of different design choices for the CoordKIWINER module was explored, including direct fusion, optimized attention mechanism fusion, and residual optimization through network structure adjustment. The experimental results demonstrated that the optimized attention mechanism fusion method yielded the best performance and was ultimately adopted in the final model. These findings highlight the significance of properly designing attention mechanisms to extract dual-dimensional features for NER tasks. Compared to existing methods, the KIWI-Coord-Prune model effectively addressed the issue of underutilized dual-dimensional information in Chinese kiwifruit texts. It significantly improved entity recognition performance for both overall text structures and individual entity categories. Furthermore, the model exhibited a degree of generalization capability, making it applicable to downstream tasks such as knowledge graph construction and question-answering systems. [Conclusions] This study presents a novel NER approach for Chinese kiwifruit texts that integrates dual-dimensional information extraction and pruning techniques to overcome challenges related to cross-paragraph dependencies and nested entity structures. The findings offer valuable insights for researchers working on domain-specific NER and contribute to the advancement of agriculture-focused natural language processing applications. However, two key limitations remain: 1) The balance between domain-specific optimization and cross-domain generalization requires further investigation, as the model's adaptability to non-agricultural texts has yet to be empirically validated; 2) the multilingual applicability of the model is currently limited, necessitating further expansion to accommodate multilingual scenarios. Future research should focus on two key directions: 1) Enhancing domain robustness and cross-lingual adaptability by incorporating diverse textual datasets and leveraging pre-trained multilingual models to improve generalization, and 2) Validating the model's performance in multilingual environments through transfer learning while refining linguistic adaptation strategies to further optimize recognition accuracy.
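
    The pruning idea can be illustrated, in a generic form, with PyTorch's built-in magnitude pruning applied to the recurrent weight matrices of a Bi-LSTM. This is not the paper's PruneBi-LSTM strategy; the dimensions and the 30% pruning ratio below are arbitrary placeholders.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A generic bidirectional LSTM over character embeddings, standing in for the
# sequence encoder of an NER model (dimensions are arbitrary).
bilstm = nn.LSTM(input_size=128, hidden_size=256, num_layers=1,
                 batch_first=True, bidirectional=True)

# Zero out the 30% smallest-magnitude entries of every weight matrix, then
# bake the masks in so the parameters stay sparse but keep their names.
weight_names = [name for name, _ in bilstm.named_parameters() if "weight" in name]
for name in weight_names:
    prune.l1_unstructured(bilstm, name=name, amount=0.3)
    prune.remove(bilstm, name)
bilstm.flatten_parameters()   # re-pack the weights after they were replaced

weights = [p for name, p in bilstm.named_parameters() if "weight" in name]
sparsity = sum((w == 0).float().mean().item() for w in weights) / len(weights)
print(f"approximate weight sparsity: {sparsity:.2f}")   # ~0.30

chars = torch.randn(4, 50, 128)     # (batch, sequence length, embedding dim)
encoded, _ = bilstm(chars)          # (4, 50, 512): forward + backward states
print(encoded.shape)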

  • Information Processing and Decision Making
    HULingyan, GUORuiya, GUOZhanjun, XUGuohui, GAIRongli, WANGZumin, ZHANGYumeng, JUBowen, NIEXiaoyu
    Smart Agriculture. 2025, 7(3): 131-142. https://doi.org/10.12133/j.smartag.SA202502008

    [Objective] Within the field of plant phenotyping feature extraction, the accurate delineation of small target boundaries and the adequate recovery of spatial details during upsampling operations have long been recognized as significant obstacles hindering progress. To address these limitations, an improved U-Net architecture was designed for greenhouse sweet cherry image segmentation. [Methods] Taking temporal phenotypic images of sweet cherries as the research subject, the U-Net segmentation model was employed to delineate the specific organ regions of the plant. This architecture was referred to as the U-Net integrating a self-supervised contrastive learning method for plant time-series images with priori distance embedding (PDE) pre-training and a graph convolutional network (GCN) skip connection for greenhouse sweet cherry image segmentation. To accelerate model convergence, the pre-trained weights derived from the PDE plant temporal image contrastive learning method were transferred to the segmentation model. Concurrently, a GCN local feature fusion layer was incorporated as a skip connection to optimize feature fusion, thereby providing robust technical support for the image segmentation task. Pre-training with the PDE plant temporal image contrastive learning method required the construction of image pairs corresponding to different phenological periods. A classification distance loss function, which incorporated prior knowledge, was employed to construct an Encoder with adjusted parameters. Pre-trained weights obtained from the PDE plant temporal image contrastive learning method were effectively transferred and applied to the semantic segmentation task, enabling the network to accurately learn semantic information and detailed textures of various sweet cherry organs. The Encoder module performed multi-scale feature extraction through convolutional and pooling layers. This process enabled the hierarchical processing of the semantic information embedded in the input image to construct representations that progress from low-level texture features to high-level semantic features. This allowed consistent extraction of semantic features from images across various scales and abstraction of the underlying information, enhancing feature discriminability and optimizing the modeling of complex targets. The Decoder module was employed to conduct upsampling operations, which facilitated the integration of features from diverse scales and the restoration of the original image resolution. This enabled effective reconstruction of spatial details and significantly improved the efficiency of model optimization. At the interface between the Encoder and Decoder modules, a GCN layer designed for local feature fusion was strategically integrated as a skip connection, enabling the network to better capture and learn the local features in multi-scale images. [Results and Discussions] Utilizing a set of evaluation metrics including accuracy, precision, recall, and F1-Score, an in-depth and rigorous assessment of the model's performance capabilities was conducted. The research findings revealed that the improved U-Net model achieved superior performance in semantic segmentation of sweet cherry images, with an accuracy of up to 0.955 0. Ablation experiment results further revealed that the proposed method attained a precision of 0.932 8, a recall of 0.927 4, and an F1-Score of 0.912 8.
The accuracy of the improved U-Net was higher by 0.069 9, 0.028 8, and 0.042 than that of the original U-Net, the U-Net with the PDE plant temporal image contrastive learning method, and the U-Net with GCN skip connections, respectively. Meanwhile, the F1-Score was higher by 0.078 3, 0.033 8, and 0.043 8, respectively. In comparative experiments against the DeepLabV3, Swin Transformer, and Segment Anything Model segmentation methods, the proposed model surpassed these models by 0.022 2, 0.027 6, and 0.042 2 in accuracy; 0.063 7, 0.147 1, and 0.107 7 in precision; 0.035 2, 0.065 4, and 0.050 8 in recall; and 0.076 8, 0.127 5, and 0.103 4 in F1-Score. [Conclusions] The PDE plant temporal image contrastive learning method and GCN techniques were incorporated to develop an advanced U-Net architecture specifically designed and optimized for the analysis of sweet cherry plant phenotypes. The results demonstrate that the proposed method is capable of effectively addressing the issues of boundary blurring and detail loss associated with small targets in complex orchard scenarios. It enables the precise segmentation of the primary organs and background regions in sweet cherry images, thereby enhancing the segmentation accuracy of the original model. This improvement provides a solid foundation for subsequent crop modeling research and holds significant practical importance for the advancement of agricultural intelligence.
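
    The transfer of pre-trained encoder weights into a segmentation network can be sketched in PyTorch as follows. The tiny encoder/decoder modules and the class count are placeholders, and the PDE pre-training and GCN skip connection themselves are omitted; only the weight-transfer mechanism is shown.

import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in for the contrastively pre-trained encoder (not the real one)."""
    def __init__(self):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)

class TinySegNet(nn.Module):
    """Stand-in segmentation network: encoder plus a minimal upsampling head."""
    def __init__(self, num_classes):
        super().__init__()
        self.encoder = TinyEncoder()
        self.head = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, num_classes, 1),
        )

    def forward(self, x):
        return self.head(self.encoder(x))

# Pretend this encoder holds weights from contrastive pre-training, then copy
# them into the segmentation model; strict=False leaves the randomly
# initialized head untouched when the checkpoint covers only the encoder.
pretrained_encoder = TinyEncoder()
model = TinySegNet(num_classes=4)   # class count is a placeholder
missing, unexpected = model.encoder.load_state_dict(
    pretrained_encoder.state_dict(), strict=False)
print("missing:", missing, "unexpected:", unexpected)   # both empty here

masks = model(torch.randn(1, 3, 64, 64))
print(masks.shape)   # torch.Size([1, 4, 64, 64]) per-class logits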

  • Information Processing and Decision Making
    WANGYi, XUERong, HANWenting, SHAOGuomin, HOUYanqiao, CUIXitong
    Smart Agriculture. 2025, 7(4): 159-173. https://doi.org/10.12133/j.smartag.SA202412004

    [Objective] Maize is one of the most widely cultivated staple crops worldwide, and its aboveground biomass (AGB) serves as a crucial indicator for evaluating crop growth status. Accurate estimation of maize AGB is vital for ensuring food security and enhancing agricultural productivity. However, maize AGB is influenced by a multitude of dynamic factors, exhibiting complex spatial and temporal variations that pose significant challenges to precise estimation. At present, most studies on maize AGB estimation rely primarily on single-source remote sensing data and conventional machine learning algorithms, which limits the accuracy and generalizability of the models. To overcome these limitations, a model architecture that integrates convolutional neural networks (CNN), long short-term memory networks (LSTM), and a self-attention (SA) mechanism was developed in this research to estimate maize AGB at the field scale. [Methods] The research utilized vegetation indices, crop parameters, and meteorological data that were collected under varying gradient water treatments in the experimental area. First, an optimized CNN-LSTM-SA model was constructed. The model employed two-dimensional convolutional layers to extract both spatial and temporal features, while utilizing max-pooling and dropout techniques to mitigate overfitting. The LSTM module was used to capture temporal dependencies in the data. The SA mechanism was introduced to compute global attention weights, enhancing the representation of critical time steps. Nonlinear activation functions were applied to mitigate multicollinearity among features. A fully connected layer was used to output the estimated AGB values. Second, the Pearson correlation coefficients between influencing factors and maize AGB were analyzed, and the importance of multi-source data was validated. Recursive feature elimination (RFE) was used to select the optimal input features. The local interpretable model-agnostic explanations (LIME) method was employed to interpret individual samples. Finally, ablation experiments were conducted to assess the effects of incorporating CNN and SA into the model, with performance comparisons made against random forest (RF) and support vector machine (SVM) models. [Results and Discussions] The correlation analysis revealed that crop parameters exhibited strong correlations with AGB. Among the vegetation indices, the improved normalized difference red edge index (NDREI) demonstrated the highest correlation (r = 0.63). To address multicollinearity issues, the visible atmospherically resistant index (VARI), soil adjusted vegetation index (SAVI), and normalized difference red edge index (NDRE) were excluded from the analysis. The CNN-LSTM-SA model integrated crop parameters, vegetation indices, and meteorological data and initially achieved a coefficient of determination (R²) of 0.89, a root mean square error (RMSE) of 129.38 g/m², and a mean absolute error (MAE) of 65.99 g/m². When only vegetation indices and meteorological data were included, the model yielded an R² of 0.83, an RMSE of 161.36 g/m², and an MAE of 89.37 g/m². Using a single vegetation index further reduced model accuracy. Based on multi-source data integration, RFE removed redundant features. After excluding the 2-meter average wind speed, the model reached its best performance with an R² of 0.92, an RMSE of 107.53 g/m², and an MAE of 55.19 g/m².
Using the LIME method to interpret feature contributions for individual maize samples, the analysis revealed that during the rapid growth stage, the model was primarily influenced by the current growth status and vegetation indices. For samples in the mid-growth stage, multi-day crop physiological characteristics had a substantial impact on model predictions. In the late growth stage, higher vegetation index values showed a clear suppressive effect on the model outputs. During the mid-growth stage of maize under varying moisture conditions, the model consistently demonstrated heightened sensitivity to low temperatures, moderate humidity levels, and optimal vegetation indices. The CNN-LSTM-SA model demonstrated more consistent fitting performance and accuracy across different growth stages and water conditions compared to the LSTM, LSTM-SA, and CNN-LSTM models. It also exceeded the performance of the RF and SVM models in all evaluation metrics. [Conclusions] This study leveraged the feature extraction capabilities of CNN, the temporal modeling strength of LSTM, and the dynamic attention of the SA mechanism to enhance the accuracy of maize AGB estimation from a spatiotemporal perspective. The approach not only reduced estimation errors but also improved model interpretability. This research could provide valuable insights and references for the dynamic modeling of crop AGB.
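
    A simplified PyTorch sketch of a CNN-LSTM model with self-attention for sequence regression is given below. It uses 1D convolutions over the time axis rather than the paper's 2D convolutions, and the feature dimensions, sequence length, and mean pooling are assumptions made for illustration only.

import torch
import torch.nn as nn

class CNNLSTMSA(nn.Module):
    """Toy CNN-LSTM with self-attention for sequence regression (e.g. AGB).

    Input: (batch, time, features) such as daily vegetation indices, crop and
    weather variables; output: one scalar estimate per sample.
    """

    def __init__(self, n_features, hidden=64):
        super().__init__()
        # 1D convolution over the time axis extracts local temporal patterns.
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Dropout(0.2),
        )
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                                   # x: (B, T, F)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)    # (B, T, hidden)
        h, _ = self.lstm(h)                                 # temporal dependencies
        a, _ = self.attn(h, h, h)                           # global attention weights
        return self.head(a.mean(dim=1)).squeeze(-1)         # pooled estimate

model = CNNLSTMSA(n_features=10)
agb_pred = model(torch.randn(8, 30, 10))   # 8 samples, 30 time steps, 10 features
print(agb_pred.shape)                       # torch.Size([8])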

  • Information Processing and Decision Making
    XUWenwen, YUKejian, DAIZexu, WUYunzhi
    Smart Agriculture. 2025, 7(4): 174-186. https://doi.org/10.12133/j.smartag.SA202504005

    [Objective] Grape is one of the world's largest cash crops in terms of total production value, and accurate estimation of its yield is crucial for agricultural and economic development. However, grape yield prediction is currently difficult and costly: detection of green grape varieties, whose berries are similar in color to the leaves, remains limited, and detection of grape bunches with small berries is ineffective. To solve the above problems, a multimodal detection framework based on transfer learning was proposed, aiming to realize the detection and counting of different grape varieties and thus provide reliable technical support for grape yield prediction and the intelligent management of orchards. [Methods] A multimodal grape detection framework based on transfer learning was proposed. This transfer learning utilized the feature representation capabilities of pretrained models, requiring only a small number of grape images for fine-tuning to adapt to the task. This approach not only reduced labeling costs but also enhanced the ability to capture grape features effectively. The multimodal framework adopted a dual-encoder-single-decoder structure, consisting of three core modules: the image and text feature extraction and enhancement module, the language-guided query selection module, and the cross-modality decoder module. In the feature extraction stage, the framework employed pretrained models from public datasets for transfer learning, which significantly reduced the training time and costs of the model on the target task while effectively improving the capability to capture grape features. By introducing a feature enhancement module, the framework achieved cross-modality fusion effects between grape images and text. Additionally, the attention mechanism was implemented to enhance both image and text features, facilitating cross-modality feature learning between images and text. During the cross-modality query selection phase, the framework utilized a language-guided query selection strategy that enabled the filtering of queries from grape images. This strategy allowed for more effective use of the input text to guide object detection, selecting features that were more relevant to the input text as queries for the decoder. The cross-modality decoder combined the features from grape images and text modalities to achieve more accurate modality alignment, thereby facilitating a more effective fusion of grape image and text information, ultimately producing the corresponding grape prediction results. Finally, to comprehensively evaluate the model's performance, the mean average precision (mAP) and average recall (AR) were adopted as evaluation metrics for the detection task, while the counting task was quantified using the mean absolute error (MAE) and root mean square error (RMSE) as assessment indicators. [Results and Discussions] This method exhibited optimal performance in both detection and counting when compared to nine baseline models. Specifically, a comprehensive evaluation was conducted on the WGISD public dataset, where the method achieved an mAP50 of 80.3% in the detection task, representing a 2.7 percentage point improvement over the second-best model. Additionally, it reached 53.2% mAP and 58.2% mAP75, surpassing the second-best models by 13.4 and 22 percentage points, respectively, and achieved an mAR of 76.5%, a 9.8 percentage point increase over the next-best model.
In the counting task, the method realized an MAE of 1.65 and an RMSE of 2.48, outperforming all other baseline models in counting effectiveness. Furthermore, experiments were conducted using a total of nine grape varieties from both the WGISD dataset and field-collected data, resulting in an mAP50 of 82.5%, an mAP of 58.5%, an mAP75 of 64.4%, an mAR of 77.1%, an MAE of 1.44, and an RMSE of 2.19. These results demonstrated the model's strong adaptability and effectiveness across diverse grape varieties. Notably, the method not only performed well in identifying large grape clusters but also showed superior performance on smaller grape clusters, achieving an mAP_s of 74.2% in the detection task, a 9.5 percentage point improvement over the second-best model. Additionally, to provide a more intuitive assessment of model performance, this study selected grape images from the test set for visual comparison analysis. The results revealed that the model's detection and counting outcomes for grape clusters closely aligned with the original annotation information from the label dataset. Overall, this method demonstrated strong generalization capabilities and higher accuracy under various environmental conditions for different grape varieties. This technology has the potential to be applied in estimating total orchard yield and reducing pre-harvest measurement errors, thereby effectively enhancing the precision management level of vineyards. [Conclusions] The proposed method achieved higher accuracy and better adaptability in detecting five grape varieties compared to other baseline models. Furthermore, the model demonstrated substantial practicality and robustness across nine different grape varieties. These findings suggested that the method developed in this study had significant application potential in grape detection and counting tasks. It could provide strong technical support for the intelligent development of precision agriculture and the grape cultivation industry, highlighting its promising prospects in enhancing agricultural practices.
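
    The language-guided query selection step can be illustrated with a short PyTorch sketch that scores image tokens by their cosine similarity to text tokens and keeps the top-scoring ones as decoder queries. This mirrors the general idea only, not the paper's exact implementation; the tensor shapes and query count are illustrative.

import torch
import torch.nn.functional as F

def language_guided_query_selection(image_tokens, text_tokens, num_queries=10):
    """Pick the image tokens most similar to the text prompt as decoder queries.

    image_tokens: (N_img, D) image features; text_tokens: (N_txt, D) text features.
    Returns (num_queries, D) selected queries and their indices.
    """
    img = F.normalize(image_tokens, dim=-1)
    txt = F.normalize(text_tokens, dim=-1)
    # Each image token is scored by its best match over the text tokens.
    sim = img @ txt.T                       # (N_img, N_txt) cosine similarities
    scores = sim.max(dim=-1).values         # (N_img,)
    top = scores.topk(num_queries).indices
    return image_tokens[top], top

# Usage with random stand-ins for encoded image patches and a short text prompt.
img_feats = torch.randn(400, 256)           # e.g. a 20x20 grid of patch features
txt_feats = torch.randn(3, 256)             # a few encoded text tokens
queries, idx = language_guided_query_selection(img_feats, txt_feats)
print(queries.shape)                         # torch.Size([10, 256])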

  • Topic--Intelligent Sensing and Grading of Agricultural Product Quality
    HUYan, WANGYujie, ZHANGXuechen, ZHANGYiqiang, YUHuahao, SONGXinbei, YESitan, ZHOUJihong, CHENZhenlin, ZONGWeiwei, HEYong, LIXiaoli
    Smart Agriculture. 2025, 7(4): 71-83. https://doi.org/10.12133/j.smartag.SA202505012

    [Objective] Fu brick tea is a popular fermented dark tea, and its "Jin hua" fermentation process determines the quality, flavor, and function of the tea. Therefore, the establishment of a rapid and non-destructive detection method for the fungal fermentation stage is of great significance for improving quality control and processing efficiency. [Methods] Visible-near-infrared (VIS-NIR) and near-infrared (NIR) hyperspectral images were acquired during the fermentation stage and, combined with key quality indexes such as moisture, free amino acids, tea polyphenols, and tea pigments (including theaflavins, thearubigins, and theabrownins), were used to analyze the variation trend of Fu brick tea. This study combined support vector machine (SVM) and convolutional neural network (CNN) models to establish quantitative detection of key quality indicators and qualitative identification of the fungal fermentation stage. To enhance model performance, the squeeze-and-excitation (SE) attention mechanism was incorporated, which strengthens the adaptive weight adjustment of feature channels, resulting in the development of the Spectra-SE-CNN model. Additionally, t-distributed stochastic neighbor embedding (t-SNE) was used for feature dimensionality reduction, aiding in the visualization of feature distributions during the fermentation process. To improve the interpretability of the model, the Grad-CAM technique was employed for CNN and Spectra-SE-CNN visualization, helping to identify the key regions the model focuses on. [Results and Discussions] In the quantitative detection of Fu brick tea quality, the best models were all Spectra-SE-CNN, with R²p values of 0.859 5, 0.852 5, and 0.838 3 for moisture, tea pigments, and tea polyphenols, respectively, indicating high correlation and modeling stability. These values suggest that the models were capable of accurately predicting these key quality indicators based on hyperspectral data. However, the R²p for free amino acids was lower (0.670 2), which could be attributed to their relatively minor changes during the fermentation process or a weak spectral response, making it more challenging to detect this component reliably with the current hyperspectral imaging approach. The Spectra-SE-CNN model significantly outperformed traditional CNN models, demonstrating the effectiveness of incorporating the SE attention mechanism. The SE attention mechanism enhanced the model's ability to extract and discriminate important spectral features, thereby improving both classification accuracy and generalization. This indicated that the Spectra-SE-CNN model excelled not only in feature extraction but also in robustness to variations in the fermentation stage. Furthermore, t-SNE revealed a clear separation of the different fungal fermentation stages in the low-dimensional space, with distinct boundaries. This visualization highlighted the model's ability to distinguish between subtle spectral differences during the fermentation process. The heatmap generated by Grad-CAM emphasized key regions, such as the fermentation location and edges, providing valuable insights into the specific features the model deemed important for accurate predictions. This improved the model's transparency and helped validate the spectral features that were most influential in identifying the fermentation stages.
[Conclusions] A Spectra-SE-CNN model was proposed in this research, which incorporates the SE attention mechanism into a convolutional neural network to enhance spectral feature learning. This architecture adaptively recalibrates channel-wise feature responses, allowing the model to focus on informative spectral bands and suppress irrelevant signals. As a result, the Spectra-SE-CNN achieved improved classification accuracy and training efficiency compared to conventional CNN models, demonstrating the strong potential of deep learning in hyperspectral feature extraction. The findings validate that hyperspectral imaging (HSI) technology enables rapid, non-destructive, and high-resolution assessment of Fu brick tea during its critical fungal fermentation stage, and confirm the feasibility of integrating HSI with intelligent algorithms for real-time monitoring of the Fu brick tea fermentation process. Furthermore, this approach offers a pathway for broader applications of hyperspectral imaging and deep learning in intelligent agricultural product monitoring, quality control, and automation of traditional fermentation processes.
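
    The squeeze-and-excitation recalibration at the heart of the Spectra-SE-CNN can be sketched in PyTorch as follows. The surrounding convolutional layers, band count, and number of fermentation stages are placeholders rather than the paper's architecture; only the SE channel-reweighting mechanism itself is shown.

import torch
import torch.nn as nn

class SEBlock1d(nn.Module):
    """Squeeze-and-excitation over the channels of a 1D (spectral) feature map."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                    # x: (B, C, L) spectral features
        w = self.fc(x.mean(dim=-1))          # squeeze: global average over bands
        return x * w.unsqueeze(-1)           # excite: reweight each channel

# A toy spectral classifier: Conv1d blocks with SE recalibration, then a
# linear head over pooled features; band and class counts are placeholders.
model = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.ReLU(),
    SEBlock1d(32),
    nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
    SEBlock1d(64),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(64, 4),                        # e.g. 4 fermentation stages (assumed)
)
logits = model(torch.randn(2, 1, 224))       # 2 spectra with 224 bands
print(logits.shape)                          # torch.Size([2, 4])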