Blog

  • The Hundred 2025 results: Southern Brave reach women’s final with win over London Spirit; Northern Superchargers end Oval Invincibles’ hopes

    Unbeaten Southern Brave secured a place in the women’s final of The Hundred with an eight-wicket win over London Spirit at Lord’s.

    Chasing 126, Maia Bouchier, who hit 43, and Laura Wolvaardt, with an unbeaten 56, put on a second-wicket partnership of 95, as Brave defeated the defending champions with six balls to spare.

    Qualification marks a return to form for the Southampton-based side, who finished bottom last year after reaching the tournament’s first three finals, winning in 2023.

    The visitors started well after winning the toss and choosing to bowl, reducing Spirit to 36-3, with spinner Mady Villiers, who claimed 3-38 overall, dismissing Georgia Redmayne and Grace Harris in the space of four balls.

    Cordelia Griffith (44) and Charli Knott (34) put on a partnership of 67 for the fourth wicket, but the former’s dismissal by Georgia Adams started a run of five Spirit wickets falling for 22 runs.

    Brave lost Danni Wyatt-Hodge for nine, given out lbw to Issy Wong on review, but Bouchier and Wolvaardt calmly set about chasing down their target.

    Wong had Bouchier caught by Eva Gray with 18 runs still required, but Sophie Devine came in and struck the winning runs to secure a seventh successive victory, equalling their own record winning streak set during their 2023 triumph.

    Brave have one group game remaining, at home to bottom side Welsh Fire on Thursday. In the final at Lord’s on Sunday, 31 August, they will play the winners of the Eliminator, which is contested by the teams finishing second and third.

    Spirit remain in fourth and will need to beat Oval Invincibles in their last match on Monday and hope other results go in their favour to make the Eliminator.

  • Sophie Turner worried about future of new ‘Harry Potter’ cast

    Sophie Turner on negative impact of social media in early career

    Sophie Turner is worried about the mental health of the new Harry Potter cast members, Dominic McLaughlin, Arabella Stanton, and Alastair Stout, because of the negative side of social media.

    During an interview with Flaunt magazine, the Hollywood actress candidly discussed how social media toxicity has affected the mental health of young celebrities.

    “I look at the kids who are about to be in the new Harry Potter and I just want to give them a hug and say, ‘Look, it’s going to be okay but don’t go anywhere near (social media),’” she told the outlet.

    She continued, “Stay friends with your home friends, keep living at home with your family, make sure your parents are your chaperones – it’s so important to have that grounding adjacent to the big, crazy stuff that you do.”

    Recalling the start of her career at the age of 13, in the role of Sansa Stark in Game of Thrones, Turner said, “I think social media was just really becoming a big thing after I started on Game of Thrones.”

    “I got a couple of years of peace and quiet and then I had to adjust. It had such a profound impact on my mental health, like more than I could tell you. It almost destroyed me on numerous occasions,” she explained, revealing the impact of early fame.

    On the prospect of her own children acting, the Dark Phoenix actress concluded, “Oh God, they’re not acting! Not until they’re at least 25!”

    Turner shares two daughters, Willa and Delphine, with her ex-husband Joe Jonas, to whom she was married from 2019 to 2023.


  • Dar stresses stronger Pak-Bangladesh youth engagement – RADIO PAKISTAN

    1. Dar stresses stronger Pak-Bangladesh youth engagement  RADIO PAKISTAN
    2. Ishaq Dar first foreign minister in 13 years to officially visit Dhaka  Dawn
    3. Pakistan deputy PM in Bangladesh for first high-level visit in years  Arab News
    4. Bangladesh cancels visa requirements for Pakistani officials for first time since 1971  TRT Global
    5. DPM departs on two-day official visit to Bangladesh  The Express Tribune

  • Confirmed team news: Ouattara starts for Brentford against Aston Villa in Premier League clash | Brentford FC

    Dango Ouattara will make his Brentford debut in the Bees’ Premier League home game against Aston Villa.

    The forward, who signed for the club last Saturday, has been named in Keith Andrews’ starting XI.

    He is one of four changes to the side at Gtech Community Stadium.

    Jordan Henderson will make his full debut for the club while Kevin Schade, who also featured as a substitute against Nottingham Forest at the City Ground, starts.

    Mikkel Damsgaard missed the season opener following the birth of his and his wife’s first child, but returns to the team for the visit of the Villans.

    The quartet they replace – Rico Henry, Mathias Jensen, Antoni Milambo and Fábio Carvalho – are all named on the bench, with Keane Lewis-Potter moving to left-back.

    On the absence of Yoane Wissa, Andrews said: “It’s not right to involve him in the squad. The team is the focus and I felt that was the right thing to do.”

    Brentford: Kelleher; Kayode, Collins, van den Berg, Lewis-Potter; Henderson, Yarmoliuk, Damsgaard; Ouattara, Thiago, Schade

    Subs: Valdimarsson, Ajer, Henry, Hickey, Jensen, Milambo, Onyeka, Carvalho, Peart-Harris

    Aston Villa: Martínez; Cash, Mings, Pau, Digne; Onana, Kamara, Tielemans; Rogers, Watkins, McGinn

    Subs: Bizot, Proctor, Buendía, Malen, Maatsen, Bogarde, Guessand, Rowe, Burrowes

  • Data covering soil management practices and farm characteristics on Swiss arable farms

    Sampling procedure

    In the context of the Horizon Europe Project “InBestSoil”, the data collection focused on arable management practices in Switzerland, specifically those practices related to soil health and soil conservation undertaken within the 2022/2023 production season. Farm selection for the survey was based on specific criteria to ensure that the data collection accurately represented arable agricultural practices in Switzerland. These criteria were designed to target farms that were significantly involved in arable agriculture, which is crucial for assessing arable soil health management practices. Eligible farms were required to meet the following criteria:

    • Grow wheat in the preceding season (2021/2022).

    • Farm at least 3 hectares of arable land in the preceding season (2021/2022).

    • Arable land must have comprised at least 20% of the total farmed area in the preceding season (2021/2022).
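
    The three eligibility criteria above can be expressed as a simple filter. The following is only a minimal sketch: the record layout, the field names, and the example farms are hypothetical and are not taken from the Federal Office of Agriculture's data or from the authors' processing code.

```python
from dataclasses import dataclass

MIN_ARABLE_HA = 3.0      # at least 3 hectares of arable land
MIN_ARABLE_SHARE = 0.20  # arable land is at least 20% of the farmed area


@dataclass
class Farm:
    farm_id: str       # hypothetical identifier
    grew_wheat: bool   # wheat grown in the 2021/2022 season
    arable_ha: float   # arable land farmed in 2021/2022, in hectares
    total_ha: float    # total farmed area in 2021/2022, in hectares


def is_eligible(farm: Farm) -> bool:
    """Apply the three selection criteria to one farm record."""
    if farm.total_ha <= 0:
        return False  # cannot compute an arable share
    arable_share = farm.arable_ha / farm.total_ha
    return (
        farm.grew_wheat
        and farm.arable_ha >= MIN_ARABLE_HA
        and arable_share >= MIN_ARABLE_SHARE
    )


# Made-up example records, for illustration only.
farms = [
    Farm("A", True, 12.0, 30.0),   # meets all three criteria
    Farm("B", True, 2.5, 5.0),     # less than 3 ha of arable land
    Farm("C", True, 4.0, 40.0),    # arable share is only 10%
    Farm("D", False, 20.0, 25.0),  # no wheat grown
]
eligible_ids = [f.farm_id for f in farms if is_eligible(f)]
print(eligible_ids)
```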

    We entered a data sharing agreement with the Federal Office of Agriculture to enable our survey campaign via access to contact information of all farmers who met the above selection criteria (see the supplementary material in the data repository for a copy of this contract)1. The Federal Office of Agriculture implemented our selection criteria on the agricultural data that they collect on a yearly basis from the direct payment applications of all Swiss farmers. Note that, at the time of our application to the Federal Office of Agriculture, data for the production season 2022/2023 was not available. We therefore specified the selection criteria using data from the preceding production season, the latest available at the time, from which the Federal Office of Agriculture could determine which farm contact details to share with us for the survey.

    In August 2023, we received the contact details of 15,023 farmers who qualified for the survey from the Federal Office of Agriculture’s records. The information we received included the email address, farm identification number, language spoken, name and form of address. However, as per our data sharing agreement with the Federal Office of Agriculture, this data was allowed exclusively for our use in this project and cannot be shared with any outside partner not party to the aforementioned data sharing contract. The contact data of farmers that was received from the Federal Office of Agriculture will be kept for the duration of the InBestSoil project and stored securely on private institutional servers in encrypted files. All contact information will be deleted at the conclusion of the project (December 2026) and all data presented herewith is strictly anonymised to protect the data and identities of the farmers who took part in the survey. Moreover, we have taken measures to prevent any farmers from being identified via their answers (for example variables such as manager age, wheat areas grown, location etc. have been classified into more homogenous categorical groups), which means that the data we present here is slightly different to the data that we have available for our own analyses, as agreed under the data sharing agreement with the Federal Office of Agriculture.

    Survey design and content

    While adoption of agricultural practices certainly varies with farm characteristics such as size, labour availability, or participation in agri-environmental schemes, these factors alone are not sufficient to explain farmer behaviour. There is no single set of drivers that consistently predicts adoption across studies or regions43. Instead, adoption depends strongly on local contexts, and the interplay of economic, social, and psychological factors44. To capture the complexity of adoption behaviour, the survey included questions on farmers’ priorities, perceptions, self-assessed competencies, and personal goals, as well as their exposure to peer practices, participation in training and advisory services, and sources of information. These dimensions are important because farmers do not make decisions in isolation; their attitudes towards risk, innovation and environmental values can influence their decisions alongside financial considerations. Such data contribute to a more thorough understanding of the multifaceted factors influencing soil health-related decisions. The inclusion of these variables also offers valuable insights into the barriers and drivers of sustainable soil management, essential for shaping targeted and effective agricultural policies and support programs.

    The full survey is available within the data repository in French, German and English1. The final survey was developed over the course of a year, including revisions resulting from three rounds of consultation with external stakeholders, internal consultation and testing with farmers. All participants in the survey were asked to give their informed consent by ticking a box in the online questionnaire, confirming their agreement to participate in the study. Additionally, participants consented to the linking of secondary geographical data with their responses, which was also confirmed by ticking a separate checkbox in the survey. Once the participants had agreed to these, the survey was administered uniformly following the structure outlined below. All questions appeared in the same order and, only if certain exclusion criteria were met – such as when their previous answer ruled out any further sub-questions – were some sub-questions hidden from the view of participants. Inclusive of all sub-questions, the survey contained 57 questions, and answering the questionnaire took farmers a median time of 23 minutes.

    The survey design was based on previously implemented surveys regarding agricultural production practices in Switzerland45,46,47,48,49. Specifically, questions on farm information and participation in soil-related programmes were included to assess farmers’ engagement with policy incentives and voluntary schemes. The inclusion of personal characteristics aimed to understand demographic drivers of management behaviour. The questions on management practices were developed in close collaboration with experts from the soil science and agricultural extension fields, and were cross-checked with relevant literature. Data on milling wheat production and related input use were collected to link agronomic decisions with productivity outcomes. Information on structural farm characteristics, such as farm type, location, and land tenure, provides context for understanding the decision-making environment and potential constraints faced by farmers. Finally, a strong focus was placed on behavioural and attitudinal factors, including information sourcing, perceived risks, and personal goals, to account for the cognitive and motivational dimensions of farmer behaviour. The following section provides an overview of the variables investigated within each of these question groups. The collected data are documented in the accompanying datasets1. Each question group corresponds to a clearly defined set of columns.

    Demographic details (Primary dataset columns B-H)

    Age, duration farm responsibility, gender, full time equivalent and whether the farm succession is already secured.

    Participation in soil health programmes (Primary dataset columns H-S)

    Organic farming support, soil cover scheme, reduced tillage scheme, herbicide-free farming scheme, pesticide-free farming scheme, efficient fertiliser use, wider row planting, beneficial insect strip, precision application, cantonal soil health support, cantonal input reduction support, cantonal investment and equipment support.

    Management Practices (Primary dataset columns T-CA)

    An overview of all management practices addressed in the survey, including their descriptions and the typical machinery used, is provided in Table 1. Farmers were asked whether they knew each practice, whether and how frequently they had applied it within the last 10 years, and whether they knew other farmers who use it. The practices covered by our survey were selected based on the input of soil scientists and agricultural extension workers based in Switzerland.

    Table 1 Overview of management practices included in the survey through which the presented dataset was collected, with descriptions and typical machinery used for each practice listed.

    Milling Wheat Production (Wheat dataset columns B-M)

    Production standard, hectares of milling wheat grown, yield milling wheat, yield milling wheat over last five seasons, quantity synthetic fertiliser, quantity organic fertiliser, sowing density, number of biostimulant treatments, number of herbicide treatments, number of fungicide treatments, number of insecticide treatments and number of plant growth regulator treatments.

    Structural Farm Characteristics (Primary dataset columns CD-CP)

    Family members employed, farm focus (arable, livestock, permanent crop, others), full time or part-time farm, percentage of rented land, whether the soil has been assessed and a soil management plan exists.

    Training and Advice (Primary dataset columns CQ-CZ)

    Advice agricultural adviser, advice agricultural retailer, advice cantonal or national institution, consult other farmers, consult social media channels, consult publications or webpages, participation equipment demonstration, participation farmer discussion or training group, participation farm demonstration, participation course.

    Behavioural and Attitudinal Factors (Primary dataset columns DA-EK)

    Respondents’ self-assessment of their perceived influence of the weather on crop production and ambitiousness of self-set production goals.

    Respondents’ self-assessment of their willingness to take risks in the domains of agricultural production, investment in agricultural technology and crop protection.

    Respondents’ self-assessment of their confidence in being able to find solutions to arable production challenges and achieve production goals by harvest end.

    Respondents’ self-reported importance of the following aspects in decision making:

    Maximising yields, minimising input costs, minimising time or labour requirements, minimising production risks, minimising farm exposure to weeds or pests or diseases, adapting to weather patterns, adapting to farmland conditions, improving soil health or structure or fertility, improving biodiversity, minimising environmental impact, expanding farm land, adapting to crop market developments, adapting to changes in direct payment rates or regulations, seeking professional agronomic advice, seeking casual advice from friends or colleagues and seeking peer approval.

    Ethical approval and pre-registration

    The survey campaign and research design were both approved separately by the ETH Zürich Ethics Commission as proposal 2023-N-212 and by the FiBL Ethics Committee as proposal FSS-2023-006. Copies of the approval letters are included in the supplementary material1. Before launching our survey, we also submitted two research plans for pre-registration of hypotheses via the online platform AsPredicted operated by the University of Pennsylvania. For further information on these, see AsPredicted #153145 and AsPredicted #153146, which were registered on 29th November 2023.

    Survey implementation

    The survey was implemented as an online survey built with LimeSurvey and distributed via email. All eligible farms received an individualised email addressed personally to the recipient and a survey link connected with a unique token, which enabled us to link the farmer responses with secondary data available for each farm. The participants were asked to give their permission for this by approving the terms and conditions we made available to them regarding how their data would be handled. By agreeing to the disclosure agreement, the farmers gave their permission for the anonymised data that they subsequently provided through the survey to be used exclusively for science and research purposes. Farmers were also given the option to opt out of the survey at any time, with no explanation needed. To incentivise participation in the survey we offered the opportunity to enrol in a lottery of 100 supermarket vouchers worth CHF 150 each and the option to receive a personalised results report comparing the farmers’ answers to the answers of other similar farms. The individualised reports were administered via a bespoke app created using R-Shiny (see technical validation section below for further details).

    Prior to the full survey launch, a pilot survey was conducted on a random sample of 1% of eligible farms (150 farms) to test the survey’s functionality and to refine any issues. The pilot survey launched on 30th November 2023, and the full survey went live six days later, on 6th December 2023. The survey was closed on 31st January 2024, after a response period of nearly two months.

    Data cleaning

    To minimise errors already at the point of data entry, the survey was designed to allow only predefined values or plausible numeric ranges for most variables. Wherever this was not technically feasible, such as in open-text fields or free numeric input, we conducted systematic data cleaning after data collection. Data cleaning involved addressing inconsistencies and missing values. In cases where values were deemed implausible or outliers, they were either removed or corrected if sufficient data from other columns was available. This cleaning procedure was applied to variables related to plant protection product treatments, yield, sowing density, labour input, and demographic information. We include the following to illustrate the approach we took as an example (note all processing codes are available in the supplementary material which outline these decisions on a line-by-line basis):

    For example, if an entry in the labour units column was listed as 48, which was inconsistent with the farm area, the value was corrected to 4.8 using a related column for recalculation. We proceeded similarly for the variable age: if an entry was obviously wrong, such as a year of birth recorded as 60 instead of in the required YYYY format (1960), and the farmer had entered 40 years in the farming experience column, the value was corrected to ‘1960’ based on logical inference. If no reliable correction could be made, the value was marked as ‘NA’ (Not Available).
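
    The two correction rules described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' processing code (which is in the supplementary material): the function names, the plausibility ceiling derived from farm area, the reference year of 2023, and the minimum starting age of 15 are all hypothetical. `None` stands in for ‘NA’.

```python
from typing import Optional


def fix_labour_units(value: float, max_plausible: float) -> Optional[float]:
    """Return a plausible labour-unit value, or None (NA) if none can be inferred.

    `max_plausible` is a hypothetical ceiling derived from a related column,
    such as farm area.
    """
    if value <= max_plausible:
        return value
    scaled = value / 10  # assume a dropped decimal point, e.g. 48 -> 4.8
    return scaled if scaled <= max_plausible else None


def fix_birth_year(
    raw: int, years_experience: Optional[int], survey_year: int = 2023
) -> Optional[int]:
    """Return a four-digit birth year, or None (NA) if no reliable fix exists."""
    if 1900 <= raw <= survey_year:
        return raw  # already in YYYY format
    if 0 <= raw <= 99 and years_experience is not None:
        candidate = 1900 + raw  # read a two-digit entry such as 60 as 1960
        age_at_start = (survey_year - candidate) - years_experience
        # Accept only if the reported experience fits the inferred age
        # (the threshold of 15 years is an assumption for this sketch).
        if age_at_start >= 15:
            return candidate
    return None


print(fix_labour_units(48, max_plausible=10.0))
print(fix_birth_year(60, years_experience=40))
print(fix_birth_year(60, years_experience=None))
```

    Keeping each rule in its own small function makes every correction traceable to a single decision, which matches the line-by-line documentation the authors describe.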

    To ensure anonymity, apart from removing precise geographical information we also grouped continuous variables such as age and farming experience into categories (e.g. age_group and years_experience_group). The data was anonymised, and no specific details were included that could link individual responses to specific farms. No randomisation was applied to the data. With regard to the secondary data, we also took measures to prevent identification by rounding the variables to the nearest integer (the codes for the processing of this data are also available in the supplementary material).
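
    As a rough illustration of the grouping and rounding steps described above, the sketch below bins a continuous value into a category and rounds a secondary variable. The bin edges and labels are hypothetical; the source does not state which categories were used for age_group or years_experience_group.

```python
from bisect import bisect_right
from typing import Optional, Sequence

# Hypothetical bin edges and labels, chosen only for illustration.
AGE_EDGES = [35, 45, 55, 65]
AGE_LABELS = ["<35", "35-44", "45-54", "55-64", "65+"]


def to_group(
    value: Optional[float], edges: Sequence[float], labels: Sequence[str]
) -> str:
    """Map a continuous value to the label of the bin it falls into."""
    if value is None:
        return "NA"
    # bisect_right counts how many edges are <= value, which is the bin index.
    return labels[bisect_right(edges, value)]


def round_secondary(value: float) -> int:
    """Round a secondary variable to the nearest integer.

    Python's built-in round() resolves exact .5 ties to the even neighbour.
    """
    return round(value)


print(to_group(63, AGE_EDGES, AGE_LABELS))
print(to_group(35, AGE_EDGES, AGE_LABELS))
print(round_secondary(12.7))
```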

  • Scientists discover forgotten particle that could unlock quantum computers

    Quantum computers have the potential to solve problems far beyond the reach of today’s fastest supercomputers. But today’s machines are notoriously fragile. The quantum bits, or “qubits,” that store and process information are easily disrupted by their environment, leading to errors that quickly accumulate.

    One of the most promising approaches to overcoming this challenge is topological quantum computing, which aims to protect quantum information by encoding it in the geometric properties of exotic particles called anyons. These particles, predicted to exist in certain two-dimensional materials, are expected to be far more resistant to noise and interference than conventional qubits.

    “Among the leading candidates for building such a computer are Ising anyons, which are already being intensely investigated in condensed matter labs due to their potential realization in exotic systems like the fractional quantum Hall state and topological superconductors,” said Aaron Lauda, professor of mathematics, physics and astronomy at the USC Dornsife College of Letters, Arts and Sciences and the study’s senior author. “On their own, Ising anyons can’t perform all the operations needed for a general-purpose quantum computer. The computations they support rely on ‘braiding,’ physically moving anyons around one another to carry out quantum logic. For Ising anyons, this braiding only enables a limited set of operations known as Clifford gates, which fall short of the full power required for universal quantum computing.”

    But in a new study published in Nature Communications, a team of mathematicians and physicists led by USC researchers has demonstrated a surprising workaround. By adding a single new type of anyon, which was previously discarded in traditional approaches to topological quantum computation, the team shows that Ising anyons can be made universal, capable of performing any quantum computation through braiding alone. The team dubbed these rescued particles “neglectons,” a name that reflects both their overlooked status and their newfound importance. This new anyon emerges naturally from a broader mathematical framework and provides exactly the missing ingredient needed to complete the computational toolkit.

    From mathematical trash to quantum treasure

    The key lies in a new class of mathematical theories called non-semisimple topological quantum field theories (TQFTs). These extend the standard “semisimple” frameworks that physicists typically use to describe anyons. Traditional models simplify the underlying math by discarding objects with so-called “quantum trace zero,” effectively declaring them useless.

    “But those discarded objects turn out to be the missing piece,” Lauda explained. “It’s like finding treasure in what everyone else thought was mathematical garbage.”

    The new framework retains these neglected components and reveals a new type of anyon — the neglecton — which, when combined with Ising anyons, allows for universal computation using braiding alone. Crucially, only one neglecton is needed, and it remains stationary while the computation is performed by braiding Ising anyons around it.

    A house with unstable rooms

    The discovery wasn’t without its mathematical challenges. The non-semisimple framework introduces irregularities that violate unitarity, a fundamental principle ensuring that quantum mechanics preserves probability. Most physicists would have seen this as a fatal flaw.

    But Lauda’s team found an elegant workaround. They designed their quantum encoding to isolate these mathematical irregularities away from the actual computation. “Think of it like designing a quantum computer in a house with some unstable rooms,” Lauda explained. “Instead of fixing every room, you ensure all of your computing happens in the structurally sound areas while keeping the problematic spaces off-limits.”

    “We’ve effectively quarantined the strange parts of the theory,” Lauda said. “By carefully designing where the quantum information lives, we make sure it stays in the parts of the theory that behave properly, so the computation works even if the global structure is mathematically unusual.”

    From pure math to quantum reality

    The breakthrough illustrates how abstract mathematics can solve concrete engineering problems in unexpected ways.

    “By embracing mathematical structures that were previously considered useless, we unlocked a whole new chapter for quantum information science,” Lauda said.

    The research opens new directions both in theory and in practice. Mathematically, the team is working to extend their framework to other parameter values and to clarify the role of unitarity in non-semisimple TQFTs. On the experimental side, they aim to identify specific material platforms where the stationary neglecton could arise and to develop protocols that translate their braiding-based approach into realizable quantum operations.

    “What’s particularly exciting is that this work moves us closer to universal quantum computing with particles we already know how to create,” Lauda said. “The math gives a clear target: If experimentalists can find a way to realize this extra stationary anyon, it could unlock the full power of Ising-based systems.”

    In addition to Lauda, other authors include the study’s first author, Filippo Iulianelli, and Sung Kim of USC, and Joshua Sussan of Medgar Evers College of The City University of New York.

    The study was supported by National Science Foundation (NSF) Grants (DMS-1902092, DMS-2200419, DMS-2401375), Army Research Office (W911NF-20-1-0075), Simons Foundation Collaboration Grant on New Structures in Low-Dimensional Topology, Simons Foundation Travel Support Grant, NSF Graduate Research Fellowship (DGE-1842487) and PSC CUNY Enhanced Award (66685-00 54).

  • Discovery of CRISPR-Cas12a clades using a large language model

    Development of an Artificial Intelligence-assisted CRISPR-Cas Scan (AIL-Scan) strategy based on an ESM large language model

    We assumed that by embedding the functional feature with protein primary sequences, we could trace the natural evolution rules and identify the CRISPR-Cas proteins in the metagenomics data directly without sequence alignments. To identify the CRISPR-Cas proteins, we developed an Artificial Intelligence-assisted CRISPR-Cas Scan (AIL-Scan) strategy (Fig. 1a). It includes the following steps:

    1. CRISPR-Cas training data is created by extracting CRISPR-associated (Cas) proteins from the NCBI database, classifying them by genes, and removing redundant sequences.

    2. Supervised fine-tuning of ESM on the CRISPR-Cas training data based on the biological information to predict the Cas protein.

    3. Feature analyses of Cas proteins, including cleavage activity, CRISPR-loci type, CRISPR loci-length, direct repeats, spacers, evolutionary analyses, MSA, and structures.

    Fig. 1: Artificial Intelligence-assisted CRISPR-Cas Scan (AIL-Scan).

    a The ESM language model is trained by Cas proteins, which were collected, classified, and clustered as input sequences. The Cas proteins were embedded and classified with multiple labels. The trans-cleavage activity prediction model was developed based on the ESM and small-scale experimental data of trans-cleavage. The trained model was applied to discover Cas proteins and predict features from the sequences extracted from the metagenome. The protein structures were visualized using Chimera59. The sequence alignment was visualized by Jalview61. b The receiver operating characteristic (ROC) curves and area under the ROC curve (AUC) for 12 Cas proteins and non-Cas proteins. c The test loss and test accuracy curves of AIL-Scan.

    We generated our training data using reviewed NCBI gene data. We annotated Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, Cas10, Cas12, and Cas13. Non-Cas proteins were extracted according to the following rules: no Cas annotation, and removal of proteins with sequence similarity over 40%. The Cas protein database was separated into training and validation databases using CD-HIT-2D with a 40% identity threshold to remove redundant sequences and avoid overfitting. We collected 76567 non-redundant positive sequences and 13047 non-Cas proteins, which were deposited in NCBI before July 5, 2023 (Supplementary Fig. 1). The maximal protein length is less than 1764 amino acids. To obtain the best classification, we introduced the “focal loss” in the classification to address the imbalance of the input data. We obtained the best model during the 13th epoch of model training, with 97.75% accuracy for the ESM 2 model with 650 million (650M) parameters (Supplementary Fig. 2). Using the 15 billion (15B) parameter model, we achieved the best performance in the 9th epoch with 98.22% accuracy (Supplementary Fig. 2). This model maintained consistent performance, achieving an accuracy of 97.68% on the independent dataset, i.e. TestSet2024, which contains sequences deposited in NCBI from July 6, 2023, to Oct 28, 2024 (Supplementary Tables 1–3). These results indicate a robust generalization of this model. The accuracy and prediction speed of AIL-Scan are comparable to those of CRISPRcasIdentifier, which integrates HMMs and machine learning (Table 1 and Supplementary Fig. 3). CASPredict was the fastest of the four tools, although its accuracy is lower than that of the machine learning based tools, i.e., AIL-Scan and CRISPRcasIdentifier. However, the NCBI data has been partially annotated by the HMM model, so we next validated AIL-Scan’s capability in recognizing “unseen proteins”.

    We utilized a recent dataset of 3601 Cas12 family protein sequences20, in which 3521 sequences (97.8%) had less than 90% similarity with the training set and 3351 sequences (93.1%) had less than 40% similarity with the training set. This test set, named TestSet2025, is significantly distinct from the training set in sequence space, making it suitable for evaluating generalization ability. AIL-Scan successfully identified 3182 Cas12 proteins, whereas the HMM model identified 1240 sequences, demonstrating the strong generalization capabilities of AIL-Scan. Considering the resource consumption, the 650M model is sufficient for the Cas prediction. We used ESM embeddings to reduce dimensionality with t-SNE for 77684 sequences and discovered that ESM can distinguish the differences in various Cas classifications. The ROC curves and AUC indicate the probability that the positive sample’s decision value is greater than the negative sample’s decision value for all the Cas and non-Cas proteins (Fig. 1b). The test loss and test accuracy also indicate that the model generalizes correctly and performs well on unseen data (Fig. 1c). We evaluated the model robustness using 5-fold cross-validation. The average accuracy is 0.9786 and the standard deviation is 0.0013 (Supplementary Table 4).
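
    The focal loss mentioned above down-weights examples that the classifier already predicts confidently, so that training concentrates on harder, often minority-class, examples. The sketch below shows the standard per-example formula, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), in plain Python. The values gamma = 2 and alpha = 1 are common defaults assumed here for illustration; the source does not report which settings AIL-Scan used.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Standard cross-entropy for the probability assigned to the true class."""
    return -math.log(p_true)


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one example.

    p_true is the predicted probability of the true class (0 < p_true <= 1).
    gamma and alpha are assumed defaults, not values taken from the paper.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


# An easy example (p = 0.9) is down-weighted far more than a hard one (p = 0.1).
for p in (0.9, 0.5, 0.1):
    print(f"p={p}: CE={cross_entropy(p):.4f}  FL={focal_loss(p):.4f}")
```

    With gamma set to 0 the expression reduces to ordinary cross-entropy, which is a convenient sanity check when wiring such a loss into a training loop.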

    Table 1 Cas protein prediction accuracy using different models

    We used the Global Microbial Gene Catalog (GMGC) metagenomic database for Cas protein discovery21. We selected 50,000 high-quality bins from GMGC and extracted 20,000 MAGs containing CRISPR loci to test the performance of AIL-Scan. The protein sequences were predicted by Prodigal software22. We collected ca. 20,000,000 protein sequences shorter than 1500 amino acids for prediction. In comparison with the established methods, AIL-Scan predicted 1379 Cas12a sequences.

    Development of a trans-cleavage activity prediction model

    The trans-cleavage activity of Cas12a has been used in various applications. Although many CRISPR-Cas12a proteins have been identified, few of them have been tested in trans-cleavage experiments. Therefore, the main challenge in this study lies in dealing with a small sample size coupled with high-dimensional embeddings, which often leads to convergence issues for most models. A total of 69 labeled Cas12a proteins (including three known Cas12a) were included in our analysis (Supplementary Data 1). Their trans-cleavage activities were assessed by the fluorophore-quencher (FQ) reporter assay. A protein was defined as active in trans-cleavage if it displayed a fluorescence intensity twice that of the negative control. Thirty-three proteins were classified as active, and the remaining 36 proteins were categorized as inactive. To evaluate the performance of our predictive model, a test set comprising 13 randomly selected proteins (approximately 20% of the sample) was used, while the remaining 56 proteins were employed for training. Initially, we recorded the last-layer embeddings from our fine-tuned ESM model for all labeled Cas12a protein sequences. These embeddings (1280 dimensions) were utilized as covariates to predict trans-cleavage activity.
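
As a rough sketch of the labeling rule and the 56/13 split described above: the function names, the placeholder protein identifiers, the random seed, and the choice of ">=" at exactly twice the control are all assumptions made for illustration, not details from the study.

```python
import random

def is_active(signal, negative_control, fold=2.0):
    """Label a protein active when its fluorescence reaches `fold` times the control."""
    return signal >= fold * negative_control

def split_train_test(names, n_test=13, seed=0):
    """Shuffle names reproducibly and hold out `n_test` of them for testing."""
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder identifiers standing in for the 69 labeled Cas12a proteins.
proteins = [f"protein_{i:02d}" for i in range(69)]
train, test = split_train_test(proteins)
```

With 69 proteins and 13 held out, the training set contains the remaining 56, matching the counts given in the text.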

    Different forms of decision tree models were evaluated in this task. Light Gradient Boosting Machine (LightGBM) achieves the highest accuracy among mainstream machine learning models when trained directly on the embeddings, with an accuracy of 69.2% on the test set. To address dimensionality-related challenges, principal component analysis (PCA) was employed to extract essential embeddings, with prediction performance evaluated across 2–15 principal components. Alongside PCA, we compared 31 alternative methods, including t-SNE, UMAP, and raw data. Detailed comparisons, training procedures, and results are provided in Table 2, Supplementary Table 5, and the supplementary notes. LightGBM, CatBoost, and RandomForest achieve an accuracy of 92.3% on the test set (12 out of 13 proteins are correctly labeled) with 4, 6, and 8 principal components, respectively. Compared to training models directly on the embeddings, extracting essential dimensions with PCA provides higher accuracy in predicting trans-cleavage activity (Supplementary Table 5). However, this model is still limited by the small dataset; more experimental data would improve its prediction accuracy. Additionally, we tested our prediction model on two unreported Cas12a candidates: ArCas12a_1 (derived from Agathobacter rectale) and LeCas12a_3 (derived from Lachnospira eligens_B). Our model predicted that ArCas12a_1 has trans-cleavage activity but LeCas12a_3 does not. In the experiment, ArCas12a_1 demonstrated significantly stronger trans-cleavage activity than the negative control, while LeCas12a_3 did not (Supplementary Fig. 4). These experimental outcomes were consistent with our model’s predictions, supporting the generalizability and robustness of the prediction model.
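
To put the reported accuracies in context, the short calculation below shows how coarse a 13-protein test set is. The 12-of-13 count is stated in the text; reading 69.2% as 9 of 13 is an inference from the arithmetic, not a figure the text reports.

```python
def accuracy(n_correct, n_total):
    """Fraction of correctly labeled samples."""
    return n_correct / n_total

n_test = 13
best = accuracy(12, n_test)   # reported: 12 of 13 proteins correct
step = 1 / n_test             # accuracy gained or lost per protein

# Every accuracy value (in percent) that a 13-protein test set can produce.
achievable = [round(100 * k / n_test, 1) for k in range(n_test + 1)]
```

Each test protein is worth about 7.7 percentage points, so the gap between 69.2% and 92.3% corresponds to three proteins, which is one way to see why the text calls for more experimental data.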

    Table 2 Cas12a protein trans-cleavage activity prediction accuracy using different strategies

    CRISPR-Cas12a loci predicted from the metagenomics

    We did further feature analyses of Cas12a candidate proteins. Phylogenetic analysis of Cas12 proteins suggests that the identified Cas12a proteins fall into the Cas12a clade (Fig. 2a). The classical CRISPR-loci, comprising essential elements such as Cas1, Cas2, and Cas4, play a pivotal role in type classification. To delve into these features, we employed AIL-Scan to predict Cas1, Cas2, and Cas4 proteins within the same CRISPR loci adjacent to the Cas12a sequence. Subsequently, we meticulously verified 300 predicted CRISPR loci to gain deeper insights manually. Normally, Cas12a is considered to have a unique CRISPR locus, comprising Cas1, Cas2, and Cas4. Intriguingly, the observed count of Cas1, Cas2, and Cas4 proteins was notably lower than that of Cas12a, suggesting the absence of these small Cas proteins in some Cas12a loci (Fig. 2b, c). Further stratification based on the number of integrase proteins led to the classification of CRISPR loci into eight distinct subtypes. The distribution of integrase proteins across these subtypes exhibited a sparse pattern (Fig. 2d). Notably, subtype VIII lacked any integrase proteins, subtype I encompassed Cas1, Cas2, and Cas4, while subtype VI exclusively featured Cas2. This nuanced classification sheds light on the diversity within CRISPR loci and underscores the intricate variations in the composition of integrase proteins among different subtypes. Our observations may provide unreported perspectives on correlations among different CRISPR-Cas systems and integrase proteins. Remarkably, the analyses using the 1000 predicted CRISPR Cas12a loci without manual verification show a strikingly similar distribution pattern as the result from the 300 manually confirmed ones, indicating this distribution is a universal phenomenon (Supplementary Fig. 5). To provide further insights, we measured the length of CRISPR loci, beginning from the start of the Cas12 protein and concluding at the first spacer. 
Subtype VIII emerged as the shortest, spanning a mere 4200 bp, while subtype I was the longest, extending over 6100 bp. Particularly noteworthy were certain subtype I CRISPR loci with lengths of up to 6700 bp, raising the possibility that they harbor uncharacterized protein elements (Fig. 2e). In line with the integrase variation, the numbers of spacers notably decreased in subtypes IV, VI, and VIII, underscoring the pivotal roles of integrases in spacer capture (Fig. 2f). Despite the divergence in spacer numbers, the stem-loop region corresponding to direct repeat sequences remained conserved (Fig. 2g). This conservation hints at a shared structural element, emphasizing the importance of the stem-loop region in CRISPR loci across different subtypes.
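
One way to read the eight subtypes is as the 2^3 presence/absence combinations of Cas1, Cas2, and Cas4 next to Cas12a. The sketch below encodes that reading, which is an assumption consistent with the Fig. 2b caption rather than something the text spells out. Only the three subtype assignments the text states (I, VI, VIII) are filled in; the other five are deliberately left unassigned, and all names are hypothetical.

```python
from itertools import product

ADAPTATION_GENES = ("Cas1", "Cas2", "Cas4")

def locus_profile(annotated_genes):
    """Return the (Cas1, Cas2, Cas4) presence/absence pattern of one locus."""
    found = set(annotated_genes)
    return tuple(gene in found for gene in ADAPTATION_GENES)

# All possible presence/absence patterns of the three genes.
all_profiles = list(product((True, False), repeat=len(ADAPTATION_GENES)))

# Only these assignments are given in the text; the rest are not guessed here.
KNOWN_SUBTYPES = {
    (True, True, True): "I",       # Cas1, Cas2, and Cas4 all present
    (False, True, False): "VI",    # Cas2 only
    (False, False, False): "VIII", # no integrase proteins
}

example = locus_profile(["Cas12a", "Cas2"])
```

A locus annotated with only Cas12a and Cas2 maps to the Cas2-only pattern, i.e. subtype VI under this reading.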

    Fig. 2: Cas12a subtypes discovered from metagenomic data.
    figure 2

    a Phylogenetic tree of Cas12 proteins. The identified Cas12a proteins in this work were highlighted in red in the Cas12a family. b Cas12a subtypes with different combinations of accessory proteins, i.e., Cas4, Cas1, and Cas2. c Statistics of Cas12, Cas1, Cas2, and Cas4 from 300 CRISPR-loci, which were verified manually. The features of the first 1000 CRISPR-loci were analyzed in Supplementary Fig. 5. d Statistics of subtypes in the 300 CRISPR-loci. e Sequence length variation in different subtypes. DNA sequence length was calculated from the start codon of the Cas12a gene to the end of the first repeat. f Statistics of spacers in different subtypes. g Sequence alignment of direct repeats in the 300 CRISPR-loci. The sequence corresponding to the stem loop region of crRNA was highlighted with a gray background. h Distribution of Cas proteins in different subtypes and species. The subtypes were colored in the inner circle. The species were labeled in the outer circle. Error bar indicates mean ± s.e.m. measured from three technical replicates. n = 3. Statistical significance was assessed using one-way ANOVA analysis. The symbol ‘#’ indicated that the metagenomes in the corresponding subtypes did not contain spacer sequences. Source data are provided as a Source Data file.

    To explore the distribution of the discovered proteins across organisms, we constructed a phylogenetic tree using the 300 manually verified candidate Cas12a proteins along with three known Cas12a proteins (LbCas12a, FnCas12a, and AsCas12a). 232 Cas12a proteins from the Lachnospiraceae family cluster into one clade. Within this clade, subclade 1 consists of 62 subtype I Cas12a proteins, 81 subtype VII Cas12a proteins, and a modest representation of other subtypes. Subtype I and subtype IV are the principal constituents of subclade 2. Furthermore, subclade 3 is marked by the exclusive presence of 28 subtype VIII Cas12a proteins originating from the Acutalibacteraceae family. Notably, 94.6% of the identified Cas12a proteins originate from enteric microorganisms (Fig. 2h), which may be due to the ease of recovering high-quality genomes from enteric microorganisms. Additionally, the thermostable YmeCas12a (subtype I) is adjacent to subtype I Cas12a proteins (Supplementary Fig. 6).

    Cas integrases in CRISPR loci

    New insights highlight the structural diversity and functional roles of Cas integrases in CRISPR loci23,24,25,26,27. Cas1, Cas2, and Cas4 are essential for integrating foreign DNA into bacterial CRISPR systems, which generates bacterial immunity26. AlphaFold228 was applied to predict all protein structures in the eight distinct subtypes, providing insights into their variation (Fig. 3 and Supplementary Fig. 7). Cas1 proteins, encompassing 92–331 amino acids, are classified into eight types based on structure and sequence (Fig. 3a, b and Supplementary Fig. 7b). Type 8 is the most prevalent Cas1 protein, resembling AfCas1 (PDB: 4N06)29, and its N-terminal and C-terminal domains (NTD, CTD) contain key catalytic sites in specific helices and loops (Supplementary Fig. 7c). Structural differences across types were analyzed via the Dali server30. The variation in CTD elements does not necessarily hinder foreign DNA acquisition31, emphasizing their structural flexibility. Cas2 proteins, containing 70–146 amino acids, also fall into eight types, with type 8 showing notable structural similarities to E. coli Cas2 (PDB: 5DQT)32 but with unique N-terminal helices (Fig. 3c, d and Supplementary Fig. 7d–f). Other types exhibit varied structural deficiencies, such as missing β-sheets or helices, affecting dimer interfaces and potentially altering DNA binding. This diversity underlines Cas2’s adaptability within Cas1–Cas2 complexes (Supplementary Fig. 7f)33. Cas4 proteins, comprising 79–206 amino acids, exhibit eight types (Fig. 3e, f and Supplementary Fig. 7g, h), with type 8 resembling I-C Cas4 (PDB: 8D3Q)24 but lacking specific helices critical for protospacer cleavage. Structural differences across types, such as missing helices or β-sheets, impact spacer insertion and integration within CRISPR systems (Supplementary Fig. 7i). 
These findings broaden our understanding of Cas4 structural variations and their functional implications in bacterial immunity. The detailed structural features of integrases are analyzed in the Supplementary Note.

    Fig. 3: Structural features of Cas integrase of CRISPR-Cas12 loci.
    figure 3

    a, c, e The RMSD matrix of Cas1, Cas2, and Cas4 structure models constructed by AlphaFold2. Colors within the heatmap, ranging from dark blue to white, represent the RMSD values ranging from high to low. The protein names were colored based on their structure type classification. The color of each protein name corresponds to the protein structure type displayed in the right panel. b, d, f Typical structure models of Cas1, Cas2, and Cas4, which were classified into different types. Secondary structures were annotated for all protein types. Type 1–7 structures of Cas1, Cas2, and Cas4 were superposed onto each full-length type 8 structure, and secondary structures were labeled. The “αX” in type 1 of (f) indicates that it does not appear in other Cas4 structure types.

    Cas12a proteins in the subtypes

    The differences in the Cas12a structures are key features of the Cas12a subtypes. We analyzed the motifs of the Cas12a sequences and discovered conserved and distinct motifs in the different subtypes, which are key for the Cas12a functions (Supplementary Fig. 8). The analysis revealed that the catalytic residues within the RuvC and Nuc domains are highly conserved among all subtypes, reflecting their critical roles in enzymatic function. Specifically, the first catalytic aspartate in the triad resides within the conserved motif IGIFRGEERN. The second catalytic glutamate displays subtype-specific distributions, appearing as MED in subtypes I, IV, V, and VI, as M/LEN/D in subtype II, and as MEK/D in subtype VIII. The third catalytic aspartate is consistently located in the motif DADANG, specifically at the second “D”. Additionally, a highly conserved TSKIDP motif was identified across all subtypes, indicating a shared functional mechanism. Other conserved motifs showed variability among subtypes, suggesting distinct sequence characteristics while maintaining overall catalytic and structural integrity. We also built the structure models of 300 Cas12a proteins using AlphaFold2, except for the failed construction, and calculated the root mean square fluctuation (RMSF) for all candidate Cas12a proteins within one subtype (Supplementary Fig. 9). The detailed analyses are appended in the Supplementary Notes. The RMSF reflects the residue-wise structural difference within one subtype. The results suggested that, despite an overall conserved structural architecture, specific regions within the proteins exhibit variability that may reflect structural adaptations specific to each subtype.

    Cas12a proteins have distinct cis– and trans-cleavage activities

    Cas12a processes the pre-crRNA transcripts into mature crRNA by its endoribonuclease activity. Then the Cas12a–crRNA complex efficiently cis-cleaves a double-stranded DNA (dsDNA), which is initiated by a PAM motif recognition. The cleaved DNA segment that remains bound then induces non-specific degradation of single-strand DNA (ssDNA) (Fig. 4a).

    Fig. 4: Recognition preference of Cas12a variants.
    figure 4

    a Scheme of Cas12a activation, cis-, and trans-cleavage. The Cas12a from different subtypes was labeled with different colors. b Binding of Cas12a with crRNAs investigated by electrophoretic mobility shift assay (EMSA). c Binding of Cas12a with DNAs investigated by EMSA. d Scheme of PAM analyses using a double-strand DNA (dsDNA) array. Normalized PAM heatmaps for EvCas12a_2 (e), AmCas12a (f), RspCas12a_2 (g), CAGCas12a (h), and RbrCas12a_1 (i). Each heatmap was normalized from 6 target sites, including the endogenous genes EMX1, DNMT1, and FANCF, 2 sites from eGFP, and 1 site from MERS virus genes. The individual maps were shown in Supplementary Fig. 12. The DNA sequences were listed in Supplementary Table 8. The WebLogos of the PAM sequences for each Cas12a variant are shown below the heatmap. Colors within the heatmap range from dark blue to white, illustrating the normalized intensity of each PAM sequence. Source data are provided as a Source Data file.

    Therefore, we evaluated the RNA binding efficiency, DNA binding efficiency, and cis- and trans-acting DNase activities of sixteen Cas12a proteins from eight subtypes, derived from Anaeroglobus micronuciformis (AmCas12a), Eubacterium_G ventriosum (EvCas12a_1 and EvCas12a_2), Erysipelatoclostridium sp. (EspCas12a), Ruminococcus_E sp. (RspCas12a_1 and RspCas12a_2), Agathobacter rectale (ArCas12a), Lachnospira eligens (LeCas12a_1 and LeCas12a_2), UBA3388 sp. (UBACas12a), RC9 sp. (RCCas12a), CAG-127 sp. (CAGCas12a), and Ruminococcus_E bromii_B (RbrCas12a_1, RbrCas12a_2, RbrCas12a_3, and RbrCas12a_4) (Fig. 4, Supplementary Fig. 10 and Supplementary Table 6). Remarkably, the direct repeat sequence of these candidate Cas12a proteins is conserved with that of well-characterized counterparts, i.e., LbCas12a (Fig. 2g and Supplementary Fig. 11). Therefore, we chose LbCas12a as the positive control in the following assays and used its crRNA scaffold in the screening step. All the Cas12a proteins show RNA and DNA binding ability as expected (Fig. 4b, c, Supplementary Fig. 10c, d, and Supplementary Table 7). However, the DNA binding ability of subtype I and subtype VIII proteins is higher than that of the other Cas12a proteins. Exploiting the inherent trans-DNase activity of Cas12a, as well as the 4 bp PAM length, we developed a simple and efficient PAM detection method. We constructed 6 short dsDNA target arrays by annealing 256 kinds of PAM sequence primer pairs in each well, which target EMX1 site 1, DNMT1 site 1, FANCF site 1, MERS site 1, eGFP site 1, and eGFP site 3 (Supplementary Table 8). Each dsDNA target was incubated with a candidate Cas12a protein, crRNA, and a FAM-BHQ reporter, and the fluorescence of each reaction was measured (Fig. 4d). Using this assay, we determined the PAM preferences of EvCas12a_2, AmCas12a, RspCas12a_2, CAGCas12a, and RbrCas12a_1. EvCas12a_2, RspCas12a_2, and CAGCas12a recognize T-rich PAMs, whereas AmCas12a prefers G-start PAMs and RbrCas12a_1 recognizes a 5′-GTV-3′ PAM (Fig. 4e–i and Supplementary Figs. 11, 12).
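
The figure of 256 PAM variants follows from a 4-nt PAM over four bases (4^4 = 256). The sketch below enumerates them and filters by IUPAC-style patterns such as TTTV, where V stands for A, C, or G. The helper names are hypothetical, and placing the 3-nt GTV motif in the 4-nt window as "NGTV" is an assumption made only for illustration.

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed for the patterns used here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "V": "ACG",   # any base except T
    "N": "ACGT",  # any base
}

def all_pams(length=4):
    """Enumerate every PAM of the given length over A, C, G, T."""
    return ["".join(p) for p in product("ACGT", repeat=length)]

def matches(pam, pattern):
    """True if `pam` fits an IUPAC `pattern` of the same length."""
    return len(pam) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(pam, pattern)
    )

pams = all_pams()                                   # one per well of the array
tttv = [p for p in pams if matches(p, "TTTV")]      # canonical Cas12a PAM
ngtv = [p for p in pams if matches(p, "NGTV")]      # assumed placement of GTV
```

Here `tttv` contains TTTA, TTTC, and TTTG, so the canonical PAM covers only 3 of the 256 wells, while the assumed NGTV pattern covers 12.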

    To corroborate the cis-acting DNase activity of candidate Cas12a proteins, we incubated Cas12a proteins with a crRNA and a linearized plasmid dsDNA. All linearized dsDNA were degraded by candidate Cas12a proteins with comparable efficiency to LbCas12a at 37 °C, with the exception of RCCas12a (Fig. 5a and Supplementary Fig. 13a). Sanger sequencing of the cleaved DNA ends revealed that AmCas12a introduced INDELs at 18 in NTS and 23 in TS, consistent with other Cas12a orthologs (Supplementary Fig. 13e, f). However, most Cas12a variants exhibited diminished DNase activity at room temperature (RT), leaving uncleaved DNA, except for subtype VIII Cas12a proteins, which lack integrases (Fig. 5b and Supplementary Fig. 13b). Subtype II Cas12a variants are slightly less active than LbCas12a in single-stranded DNA (ssDNA) degradation, while EspCas12a, EvCas12a_1, EvCas12a_2, and ArCas12a exhibited moderate activity. In contrast, the other Cas12a variants displayed notably lower activity (Fig. 5c and Supplementary Fig. 13c). Most of these Cas12a proteins display considerable cis-cleavage activity but differ somewhat from LbCas12a in trans-cleavage activity. The ion preference assay reveals that these Cas12a proteins can be activated by Mn2+, similar to LbCas12a34. Mg2+ is ineffective in activating the trans ssDNA cleavage activity of low-activity Cas12a variants, whereas Mn2+ activates their trans DNase activity (Fig. 5d and Supplementary Figs. 13d and 14). To investigate the genome-editing ability of candidate Cas12a in eukaryotic cells, we selected 6 target sites with canonical PAM, which can be recognized by all the tested Cas12a (Fig. 5e and Supplementary Table 9). AmCas12a exhibits an average editing efficiency of 49.6% across six sites, with remarkable peaks at sites 3 (85.4%) and 6 (84.9%). 
In contrast, EvCas12a_2 displays an average editing efficiency of 20.3%, with its highest performance observed at site 1 (25.8%). RspCas12a_2 and RbrCas12a_2, which lack integrase in the loci, yield modest average editing efficiencies of 14.3% and 17.8%, respectively, with notable peaks at site 3 (26.3% and 37.3%, respectively). ArCas12a shows an average editing efficiency comparable to that of AmCas12a (45.4%), with a notable peak at site 3 (75.8%). LeCas12a_1 shows an average editing efficiency of 6.2% and a maximum efficiency of 25.7% at site 2. UBACas12a exhibits nearly negligible editing efficiency, with the highest activity reaching 2.1%. At site 4, CAGCas12a and LeCas12a_2 demonstrate peak genome-editing efficacy, at 81.7% and 73.8%, respectively, with mean editing efficiencies of 28.8% and 26%. AsCpf1 attains an average editing efficiency of 65.5%, with its maximum at site 6 (84.7%). Finally, LbCas12a shows an average editing efficiency of 25.6% and a maximum efficacy of 53.5% at site 6.

    Fig. 5: Cleavage efficiency of Cas12a proteins.
    figure 5

    a, b Cleavage of dsDNA by Cas12a subtypes at 37 °C (a) and 25 °C (b). c Trans-cleavage of ssDNA by Cas12 subtypes using fluorescence-labeled ssDNA reporter. d Divalent cation ions’ preference for the Cas12a variants. Colors within the heatmap, ranging from dark blue to white, indicated the trans-cleavage activity from high to low. Time-course kinetic analyses were analyzed in the Supplementary Fig. 14. e Cellular gene editing efficiency on targeting sites. Two sites were selected from FANCF, EMX1, and DNMT1, respectively. The statistical significance was calculated using the LbCas12a as a reference at each site. The detailed sequences were listed in Supplementary Table 9. Error bar indicates mean ± s.e.m. measured from three technical replicates. n = 3. Statistical significance was assessed using a two-tailed unpaired t-test. Source data are provided as a Source Data file.

    The AmCas12a–crRNA binary complex

    The protein sequence identity of the 16 candidate Cas12a proteins to AsCas12a, FnCas12a, and LbCas12a is low, ranging from 30% to 46% (Fig. 6a and Supplementary Fig. 15). In three-dimensional structure, Cas12a proteins within the same subclade exhibit a high degree of similarity. However, AmCas12a deviates slightly from its subclade I Cas12a counterparts (Fig. 6d, f and Supplementary Fig. 15).

    Fig. 6: Structure of AmCas12a protein.
    figure 6

    a Domain organization of the AmCas12a protein. Detailed protein sequences and alignments are provided in Supplementary Fig. 19. The REC1, REC2, PI, WED, BH, RuvC, and Nuc domains were highlighted with distinct colors. b The cartoon representation of the structure of the AmCas12a–crRNA and schematic of the crRNA used for structural analysis. The nucleotides of crRNA are labeled with numbers. c The structure of AmCas12a revealed by cryo-EM (PDB: 8KGF, EMDB: EMD-37219). The structural alignments with known Cas12a and other variants were analyzed in Supplementary Fig. 17. The structural domains were distinguished according to the color codes at the bottom. d The RMSD matrix of Cas12 structure models constructed by AlphaFold2. Colors within the heatmap from dark blue to white represent the RMSD values from high to low. e Interaction network of crRNA with residues in AmCas12a. The detailed interactions of crRNA seed regions with AmCas12a were shown in Supplementary Fig. 18. f The AlphaFold2 structure models of the Cas12a proteins used in this paper. g Mismatch analyses of AmCas12a. Error bar indicates mean ± s.e.m. measured from three technical replicates. n = 3. Source data are provided as a Source Data file.

    To understand the molecular details underlying the RNA binding behavior of AmCas12a, we obtained a cryo-EM map of the crRNA-bound complex, which consists of AmCas12a and a 44-nt crRNA, at 2.9 Å resolution (Fig. 6b, c, Supplementary Figs. 16 and 17, and Supplementary Table 10). The AmCas12a–crRNA structure maintains a bilobed architecture (Fig. 6c), similar to other Cas12a structures35,36. Nonetheless, the AmCas12a–crRNA complex adopts a distinct conformation compared with its counterparts. Specifically, the REC domain of AmCas12a is rotated relative to that in the LbCas12a–crRNA and FnCas12a–crRNA complexes. Relative to LbCas12a and FnCas12a, the REC1 domain of AmCas12a deviates by 7.3° and 9.4°, respectively, and the REC2 domain by 4.8° and 6.2°, respectively (Supplementary Fig. 17d, e).

    As observed in the LbCas12a and FnCas12a crRNA binary structures, the repeat-derived pseudoknot in the 5’ handle of the crRNA is ordered. However, the crRNA conformation is markedly different from that of the crRNA bound by LbCas12a or FnCas12a. Owing to its flexibility, the spacer-derived part of the crRNA is largely unresolved in the Cas12a–crRNA binary complexes35,36. Notably, an extra RNA stem formed by A(1)–A(5) and U(18)–U(22) within the crRNA spacer region makes part of the spacer region, including the seed sequence, well-defined in the central cavity of AmCas12a, where it adopts an A-form-like helical conformation, but the A(−10)–G(−6) and G(6)–A(15) nucleotides of the crRNA are unresolved (Fig. 6b and Supplementary Fig. 18). To accommodate the double RNA stem substrate, the REC lobe of AmCas12a rotates away from the NUC lobe. Unsurprisingly, the docking of crRNA to AlphaFold-generated AmCas12a causes a severe clash in the REC domain (Supplementary Fig. 15c). The extra RNA stem is stabilized by multiple interactions between the ribose and phosphate moieties of the crRNA backbone and specific residues within the WED, REC1, and RuvC domains of AmCas12a (Fig. 6e). These include residues T19, H751, K522, and H861 from the WED domain, Y50 and R168 from the REC1 domain, and Q1003 from the RuvC domain, all of which are conserved among Cas12a orthologs except Q1003, which forms a hydrogen bond with the phosphate of U(18) (Supplementary Fig. 18). In contrast to the FnCas12a–crRNA complex, the spacer segment of the crRNA mainly interacts with the WED domain of AmCas12a.

    Compared to the LbCas12a–crRNA complex and FnCas12a–crRNA complex, the divalent Mg ions are in the same location (Supplementary Fig. 17a–c). Consistent with a seed sequence-dependent mechanism of DNA targeting and in broad agreement with previous analyses of AsCas12a, LbCas12a activities in vivo, and FnCas12a activities in vitro35,37,38, cleavage of DNA substrates with single-nucleotide mismatches in the seed segment was almost completely impaired, while mismatches in the PAM-distal region of the DNA target were mostly tolerated (Fig. 6g).

    Specific detection of single-nucleotide mutation by AmCas12a

    Cas12a is a promising tool for next-generation molecular diagnostics; however, it suffers from PAM limitations39. An oncogene SNP offers only a small sequence window to probe, and the traditional PAM, TTTV, cannot cover all SNPs. Therefore, we tested whether AmCas12a can distinguish SNPs without a traditional PAM (Fig. 7a). The oncogenic mutation KRAS c.34 G > T (G12C) has no available TTTV in the adjacent sequences (Fig. 7b). Among the Cas12a proteins that underwent PAM preference testing, AmCas12a, EvCas12a_2, CAGCas12a, and RbrCas12a_1 showed potential for recognizing the G12C mutation. The results revealed that AmCas12a exhibited the best performance (Supplementary Fig. 20). We designed crRNAs targeting the SNP (Fig. 7b). According to the fluorescence intensity, we selected the crRNA inducing the strongest signal, i.e., crRNA 1 for the KRAS mutant (Fig. 7c). AmCas12a can detect ten copies of the KRAS mutant (Fig. 7d). Furthermore, we diluted the target mutant and evaluated the sensitivity of detection. AmCas12a can even distinguish 0.1% KRAS mutant in a wild-type gene background, which is more sensitive than Sanger sequencing (Fig. 7e, f).

    Fig. 7: AmCas12a detection of KRAS mutants.
    figure 7

    a Scheme of single-nucleotide mutant detection by Cas12a. b Synthetic crRNA for single-nucleotide KRAS mutation based on the PAM preference of AmCas12a. The single-nucleotide polymorphism (SNP) site was highlighted in red. c AmCas12a detection of KRAS G12C with various crRNAs and Mn2+. d Detection limit of KRAS mutant using recombinase polymerase amplification (RPA) integrated with Cas12a. The fluorescent images and fluorescence intensity of the 15-min reaction were shown. The copy numbers of the target DNA were shown on the x-axis. e Sensitivity of the AmCas12a detection. KRAS mutant DNA was spiked in the wild type sequences with various ratios, which were shown in the x-axis. f Sanger sequencing results of wild-type KRAS and mutant with different ratios. NC represented the negative control without target DNA. Error bar indicates mean ± s.e.m. measured from three technical replicates. n = 3. Statistical significance was assessed using a two-tailed unpaired t-test. Source data are provided as a Source Data file.

  • Serbia announce star-studded roster for FIBA EuroBasket 2025

    BELGRADE (Serbia) – The Serbian national team have officially confirmed their 12-man roster for the upcoming FIBA EuroBasket 2025, headlined by three-time NBA MVP Nikola Jokic.

    Head coach Svetislav Pesic finalized the squad at the end of a perfect 7-0 run in friendly games, positively preparing for the biggest event of the summer.

    SERBIA’S ROSTER FOR FIBA EUROBASKET 2025

    Aleksa Avramovic, Bogdan Bogdanovic, Ognjen Dobric, Marko Guduric, Nikola Jokic, Nikola Jovic, Stefan Jovic, Vanja Marinkovic, Vasilije Micic, Nikola Milutinov, Filip Petrusev, Tristan Vukcevic

    The Eagles have been flying this summer, winning each of the seven friendly contests against Bosnia and Herzegovina, Poland, Greece, Cyprus, Czechia, Germany, and Slovenia. Against Luka Doncic and teammates, they won by 34 points.

    Competing as Serbia since 2007, they have finished as EuroBasket runners-up in 2009 and 2017 but are missing the jewel in the crown.

    They were runners-up at the FIBA Basketball World Cup 2023 and won bronze at the Olympic Games in Paris 2024, almost upsetting Team USA in the Semi-Finals.

    The goal, now, is to come back as European champions to Belgrade – they have plenty of talent and experience on their roster, probably the strongest and most complete among all participating teams.

    Svetislav Pesic’s team will play the Group Phase in Riga, hoping to extend their Latvian trip to the final stages, alongside co-hosts Latvia, Estonia, Portugal, Czechia, and Türkiye.

    They will begin their campaign against Estonia on August 27 at 20:15 CET.

    FIBA

  • Venues for 2027 ODI World Cup announced; South Africa to host 44 games. Remaining ten games to be played in…

    Cricket South Africa (CSA) has confirmed the venues for the 2027 ODI World Cup, which will be played in South Africa, Namibia, and Zimbabwe. The 50-over tournament will be played across the host cities, including Johannesburg, Pretoria, Cape Town, Durban, Gqeberha, Bloemfontein, East London, and Paarl. South Africa is all set to host 44 matches, while the remaining 10 will be played across Namibia and Zimbabwe.

    This announcement came along with the formation of the Local Organising Committee Board (LOCB), spearheaded by former South African cabinet minister Trevor Manuel as Independent Chairman.

    “CSA’s vision is to stage a global, inspiring event which will reflect the face of South Africa—diverse, inclusive, and united. The tournament will be vibrantly different in its style and atmosphere, and its experiences. It will provide players, fans and partners with the most unique, unforgettable experience,” CSA Board Chairperson, Pearl Maphoshe said in an official statement.

    “CSA offers its full support to the appointed LOCB and is confident in their ability to successfully deliver on the mandate set, ensuring a seamless and impactful event,” Maphoshe added.

    The 2027 ODI World Cup will be the 14th edition of the tournament and will be played in October and November 2027.


    This will be the second time that South Africa and Zimbabwe co-host the tournament, after the 2003 edition. Namibia is hosting the competition for the first time.

    Format of the 2027 World Cup

    The 2027 World Cup will have two groups of seven teams, with the top three from each group progressing to the Super Six stage. This format was also used in the 2003 edition.

    South Africa and Zimbabwe have automatically qualified for the tournament since they are the hosts. The top eight teams in the ICC ODI rankings as of 31 March 2027 will also seal their qualification for the tournament.

    Despite Namibia hosting the tournament for the first time, they are not guaranteed a spot since they are not a full member nation of the ICC. The side will have to go through the standard qualification pathway.

    Australia are the defending champions, having beaten India in the 2023 final at the Narendra Modi Stadium in Ahmedabad.


  • Penarth Oasis fan meets band members on way to Toronto gig

    Penarth Oasis fan meets band members on way to Toronto gig

    An Oasis fan who has flown Half the World Away to watch the band play live in Canada on Sunday ended up meeting two members of the band en route to Toronto.

    Rob Murray, 50, from Penarth, Vale of Glamorgan, said the “dream trip” became extra special when he got chatting with lead guitarist Gem Archer at a restaurant in Heathrow.

    And he also had a selfie with Oasis co-founder and guitarist Paul “Bonehead” Arthurs after bumping into him at an airport luggage carousel.

    Rob said Archer spoke highly of Oasis’s opening reunion tour gig in Cardiff and also said he had enjoyed performing with Noel Gallagher’s High Flying Birds at Cardiff Castle in 2019.

    “It was amazing being in his company for 20 minutes,” said Rob, who explained to Archer that he was going to see their gig in Toronto, Ontario, and stay with a friend who had flown over to watch their Wembley gig.

    “I could have talked all things Oasis, High Flying Birds and guitars all day – this is one time I wish my flight had been delayed.”

    Rob said he asked Archer what would happen after the Oasis tour and “if there was new material on the way”.

    “He replied, as things stand, after this tour, ‘it’s back to the day job’ – recording with High Flying Birds.”
