Training and Organizational Performance in Kenya’s National Government State Departments: A Mixed-Methods Analysis
1. Introduction
Training is widely recognised as a key driver of employee capability and organisational performance. In the human resource development literature, training is treated not as a routine administrative activity but as a strategic process through which organisations build knowledge, skills, and competencies that support sustained performance (Aguinis & Kraiger, 2009; Garavan, 2007; Swanson, 2001). This role is especially important in public institutions, where service delivery and policy implementation depend heavily on employee competence, responsiveness, and the consistent execution of work.
Even so, training does not improve performance automatically. Its value depends on whether it is relevant to job demands, well delivered, and reinforced after the training event. Research has shown that employee development contributes to organisational outcomes when it is tied to performance goals and supported by broader development systems (Jacobs & Washington, 2003; Potnuru & Sahoo, 2016; Sung & Choi, 2014). More recent work also points to the importance of transfer, implementation, and institutional support. Ford et al. (2018) argue that training often fails to generate organisational value when acquired knowledge and skills are not transferred to, applied in, and sustained within the workplace, while Nguyen (2019) and McGraw (2014) show that development systems are more effective when they are aligned with organisational priorities and embedded in routine practice.
The wider literature therefore supports a balanced view. Training can improve competence, work quality, innovation, and organisational effectiveness, but its benefits depend on how it is designed and managed (Aguinis & Kraiger, 2009; Collins & Smith, 2006; Potnuru & Sahoo, 2016; Sung & Choi, 2014). Strategic HRD studies make the same point: training is more likely to produce measurable gains when it is integrated with organisational needs, supported by managers, and connected to follow-up mechanisms that help employees use what they have learned (Garavan, 2007; Jacobs & Washington, 2003; Swanson, 2001).
This study is anchored in that distinction between the promise of training and the conditions required for it to work. The descriptive findings showed moderate levels of both training and development and organisational performance. Within the training construct, the clearest strength was the perceived contribution of training to competence and day-to-day work quality, while equitable access to training opportunities and post-training support were weaker areas. This suggests that training existed in the ministries as an institutional practice, but that its reach and follow-through were uneven.
The qualitative findings sharpen this interpretation. Respondents did not question the value of training itself. Instead, they focused on the conditions under which training was most likely to improve organisational performance. Structured induction, job-relevant learning, fair access across cadres, post-training coaching and tools, regular training-needs assessment, and simple monitoring of outcomes emerged as the most consistent themes. The issue, therefore, is not whether training matters, but how it is structured, delivered, and sustained in practice.
The inferential findings strengthen the same conclusion. Training was positively and significantly related to organisational performance, and it remained a meaningful predictor even after controlling for other strategic human resource development factors and employee motivation. In that sense, training emerges as a credible driver of organisational performance, although its effect depends less on the existence of training alone and more on the quality of implementation.
On that basis, this study examined the effect of training and development on organisational performance in Kenya’s national government state departments. By focusing on implementation quality, the study moves beyond general claims about the importance of employee development and offers context-specific evidence on how training contributes to organisational performance in the public sector.
2. Materials and Methods
2.1. Study Design
This study adopted a cross-sectional mixed-methods design to examine the relationship between training and organizational performance in Kenya’s national government state departments. The design combined a dominant quantitative strand with a complementary qualitative strand to generate both statistical evidence and contextual explanation. The quantitative component provided the primary basis for testing the effect of training on organizational performance, while the qualitative component enriched interpretation by capturing respondents’ views on induction, job-relevant training, equitable access, post-training support, training needs assessment, and monitoring of training outcomes in the public service. The design was appropriate because the study sought not only to determine whether training and development influences organizational performance, but also to explain the institutional conditions through which that relationship is experienced.
2.2. Participants
The study was conducted among public officers serving in Kenya’s national government state departments and allocation units domiciled in Nairobi. Participants were drawn from managerial, supervisory, technical, professional, administrative, and officer-level cadres because these categories are involved in, or directly affected by, training, induction, skills upgrading, supervision, work-process improvement, and service delivery.
The verified sampling frame comprised 14,926 officers distributed across 21 state departments and allocation units: Immigration Citizen Services, Transport, Livestock Development, Energy, Education-Vocational and Technical Training, Health, Public Health, Lands and Physical Planning, Social Protection and Senior Citizenship, Trade, Youth Affairs, Devolution, Irrigation, ICT-Broadcasting, Tourism and Wildlife, Micro and Small Enterprise, Gender and Affirmative Action, Correctional Services, Labour, Foreign Affairs, and Roads.
Yamane’s finite population formula at a 5% margin of error produced a minimum sample of 390 respondents. The minimum sample was allocated proportionately across the 21 departments using each department’s verified staff count. To reduce the effect of non-response and incomplete returns, 530 questionnaires were administered; 426 were returned, all were sufficiently completed, and all were retained for analysis. The final analytical sample was therefore N = 426, representing an 80.4% field response rate and 109.2% of the Yamane minimum sample.
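The sample-size arithmetic above can be retraced in a few lines. The sketch below is illustrative only: it assumes the standard form of Yamane's formula, n = N / (1 + N·e²), rounded up to the next whole respondent, and the function name is hypothetical rather than taken from the study's own workflow.

```python
import math


def yamane_sample_size(population: int, margin_of_error: float) -> int:
    """Yamane's finite-population formula, rounded up to a whole respondent."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))


population = 14_926            # verified sampling frame reported in the text
minimum_n = yamane_sample_size(population, 0.05)

issued, returned = 530, 426    # questionnaires issued and returned
response_rate = 100 * returned / issued
share_of_minimum = 100 * returned / minimum_n

print(minimum_n)                    # 390
print(round(response_rate, 1))      # 80.4
print(round(share_of_minimum, 1))   # 109.2
```

The three printed values correspond to the minimum sample, the field response rate, and the share of the Yamane minimum reported in this section.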
Proportional allocation was applied at the departmental level so that larger departments contributed more respondents than smaller departments, while all sampled departments remained represented. During administration, questionnaires were distributed across available functional areas and cadres to capture managerial, supervisory, technical, administrative, and officer-level perspectives within the practical access constraints of a working public-service environment. The sampling frame, proportional allocation, and valid returns are summarised in Table 1.
Table 1. Sampling frame, proportional allocation, and valid returns.
Department/allocation unit | Verified staff (N_h) | Yamane allocation (n_h) | Questionnaires issued | Returned/valid (r_h) | Final weight (N_h/r_h) |
Immigration Citizen Services | 2879 | 75 | 102 | 82 | 35.11 |
Transport | 747 | 20 | 27 | 21 | 35.57 |
Livestock Development | 1934 | 51 | 69 | 55 | 35.16 |
Energy | 250 | 7 | 9 | 7 | 35.71 |
Education-Vocational and Technical Training | 1294 | 34 | 46 | 37 | 34.97 |
Health | 1309 | 34 | 46 | 37 | 35.38 |
Public Health | 1236 | 32 | 44 | 35 | 35.31 |
Lands and Physical Planning | 1101 | 29 | 39 | 32 | 34.41 |
Social Protection and Senior Citizenship | 650 | 17 | 23 | 19 | 34.21 |
Trade | 273 | 7 | 10 | 8 | 34.12 |
Youth Affairs | 247 | 6 | 9 | 7 | 35.29 |
Devolution | 177 | 5 | 6 | 5 | 35.40 |
Irrigation | 171 | 4 | 6 | 5 | 34.20 |
ICT-Broadcasting | 216 | 6 | 8 | 6 | 36.00 |
Tourism and Wildlife | 148 | 4 | 5 | 4 | 37.00 |
Micro and Small Enterprise | 121 | 3 | 4 | 4 | 30.25 |
Gender and Affirmative Action | 118 | 3 | 4 | 3 | 39.33 |
Correctional Services | 325 | 8 | 12 | 9 | 36.11 |
Labour | 517 | 13 | 18 | 15 | 34.47 |
Foreign Affairs | 1054 | 28 | 37 | 30 | 35.13 |
Roads | 159 | 4 | 6 | 5 | 31.80 |
Total | 14,926 | 390 | 530 | 426 | - |
Note. N_h = verified staff count in the department or allocation unit; n_h = proportional share of the Yamane minimum sample; r_h = returned and valid questionnaires. The final weight is N_h/r_h.
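The proportional allocation in Table 1 can be illustrated with a short sketch. The rounding rule used in the study is not stated, so the largest-remainder method below is an assumption: each department first receives the whole-number part of 390 × N_h / N, and the remaining places go to the departments with the largest fractional parts. The function name is hypothetical.

```python
import math

# Verified staff counts (N_h) as listed in Table 1.
staff = {
    "Immigration Citizen Services": 2879, "Transport": 747,
    "Livestock Development": 1934, "Energy": 250,
    "Education-Vocational and Technical Training": 1294, "Health": 1309,
    "Public Health": 1236, "Lands and Physical Planning": 1101,
    "Social Protection and Senior Citizenship": 650, "Trade": 273,
    "Youth Affairs": 247, "Devolution": 177, "Irrigation": 171,
    "ICT-Broadcasting": 216, "Tourism and Wildlife": 148,
    "Micro and Small Enterprise": 121, "Gender and Affirmative Action": 118,
    "Correctional Services": 325, "Labour": 517, "Foreign Affairs": 1054,
    "Roads": 159,
}


def largest_remainder_allocation(strata: dict, total_sample: int) -> dict:
    """Allocate total_sample across strata in proportion to stratum size.

    Assumed rounding rule (not stated in the source): largest remainder.
    """
    population = sum(strata.values())
    exact = {name: total_sample * n / population for name, n in strata.items()}
    allocation = {name: math.floor(value) for name, value in exact.items()}
    shortfall = total_sample - sum(allocation.values())
    # Give the leftover places to the strata with the largest fractional parts.
    ranked = sorted(exact, key=lambda name: exact[name] - allocation[name],
                    reverse=True)
    for name in ranked[:shortfall]:
        allocation[name] += 1
    return allocation


allocation = largest_remainder_allocation(staff, 390)

# Final weight for one department: verified staff divided by valid returns.
immigration_weight = staff["Immigration Citizen Services"] / 82
print(sum(allocation.values()), round(immigration_weight, 2))  # 390 35.11
```

Under this assumed rule the allocation sums to the Yamane minimum of 390, and the weight for Immigration Citizen Services matches the value shown in the table.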
2.3. Measures
Data were collected using a structured questionnaire comprising closed-ended Likert-scale items and one open-ended item. For this article, training was measured using seven indicators: training needs assessment, induction and orientation, training relevance, equitable access to training opportunities, post-training support, evaluation of training outcomes, and contribution of training to competence and day-to-day work quality.
Organisational performance was measured using six indicators aligned with the thesis framework and public-sector performance expectations: efficiency, effectiveness, service quality, innovation, accountability, and transparency. As shown in Table 2, each indicator is defined and linked to an example item to clarify how the outcome construct was operationalised.
All closed-ended items were rated on a five-point Likert scale ranging from 1 = strongly disagree to 5 = strongly agree. Because the items were positively framed, reverse coding was not required. Composite scores were computed using the mean-of-items approach to preserve the original scale metric and allow each construct to be interpreted as an average level of agreement. For descriptive interpretation, composite scores were classified as low (1.00–2.49), moderate (2.50–3.49), and high (3.50–5.00), while for inferential analysis they were treated as continuous variables.
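A minimal sketch of the mean-of-items scoring and the three descriptive bands follows. The respondent ratings are hypothetical, and the bands are implemented as half-open intervals (below 2.50, below 3.50, otherwise high) so that unrounded means falling between the printed limits are still classified; that boundary handling is an assumption.

```python
from statistics import mean


def composite_score(item_ratings):
    """Mean-of-items composite, kept on the original 1-5 scale."""
    return mean(item_ratings)


def classify(score):
    """Map a composite or item mean to the descriptive bands used in the text."""
    if score < 2.50:
        return "low"
    if score < 3.50:
        return "moderate"
    return "high"


# Hypothetical ratings by one respondent on the seven training items.
ratings = [3, 4, 2, 3, 3, 4, 4]
score = composite_score(ratings)

print(round(score, 2), classify(score))   # 3.29 moderate
print(classify(3.57))                     # high
```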
Table 2. Organisational performance indicators and example items.
Indicator | Operational meaning in this study | Example questionnaire item |
Efficiency | Prudent and timely use of human, financial, and operational resources. | Our State Department uses available resources efficiently to deliver services. |
Effectiveness | Achievement of planned targets, mandates, and performance-contract obligations. | Our State Department achieves its planned performance targets within expected timelines. |
Service quality | Reliability, responsiveness, and usefulness of services provided to internal or external clients. | Our State Department provides reliable and responsive services to clients and stakeholders. |
Innovation | Improvement of work processes, adoption of new methods, and problem-solving capacity. | Our State Department encourages improved ways of performing tasks and solving service-delivery problems. |
Accountability | Adherence to rules, responsible use of authority, and answerability for results. | Our State Department promotes accountability in the execution of duties and use of public resources. |
Transparency | Openness in procedures, information sharing, and clarity of decision-making processes. | Our State Department communicates procedures, decisions, and performance expectations clearly. |
Note: The examples show the type of positively framed Likert item used to operationalise each performance dimension.
2.4. Validity and Reliability
The instrument was pilot tested before the main survey to assess clarity, wording, coherence, item flow, and ease of completion. Minor refinements were made to improve wording, layout, and response instructions, while the substantive constructs were retained. Face validity was assessed through participant feedback during piloting, while content validity was strengthened through expert review by academic supervisors and public-sector HRD specialists to confirm the representativeness and relevance of the items.
Construct validity was assessed using exploratory factor analysis with principal component extraction and Varimax rotation. The decision criteria specified for construct validity were eigenvalues greater than 1.0, factor loadings of at least 0.50 on the intended factor, cross-loadings below 0.40, communalities of at least 0.40, a Kaiser-Meyer-Olkin value of 0.60 or above, and a statistically significant Bartlett’s test of sphericity at p < 0.05. The instrument demonstrated strong suitability for factor analysis, with a Kaiser-Meyer-Olkin value of 0.936 and a significant Bartlett’s test of sphericity, χ²(666) = 8777.85, p < 0.001. Principal component analysis supported the conceptual structure. Harman’s single-factor test was also used to assess common method variance, with the decision rule that a first factor accounting for more than 50% of the total variance would indicate a potential problem. In this study, the first factor accounted for 39.1% of the total variance, indicating that common method bias was not dominant.
Reliability was assessed using Cronbach’s alpha, with coefficients of 0.70 or higher considered acceptable. The training scale achieved a Cronbach’s alpha of 0.918, well above this threshold, indicating high internal consistency.
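For readers who wish to retrace the reliability calculation, the sketch below implements the usual formula for Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). It uses sample variances and hypothetical data; the software and variance convention used in the study are not reported, so both are assumptions.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one rating per item)."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose to item columns
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical data: three items that move together perfectly.
perfectly_consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(perfectly_consistent)
print(round(alpha, 3))   # 1.0
```

With real survey data the coefficient falls below 1; values of 0.70 or higher meet the acceptance rule stated above.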
2.5. Procedure
Data collection followed a structured and ethically guided process. Prior to fieldwork, the study obtained institutional approvals and permissions from the relevant authorities and participating departments. The questionnaire was administered to eligible respondents using a hybrid approach that combined physical distribution and digital delivery, supported by follow-up communication to improve response rates. Each questionnaire was accompanied by participant information and consent details to ensure voluntary participation, confidentiality, and informed decision-making. Completed questionnaires were checked for completeness and consistency before being entered into the analysis dataset, and only correctly completed instruments were retained.
Before inferential analysis, the dataset was screened for completeness, range errors, duplicate cases, and multivariate outliers. No missing values, out-of-range entries, or duplicate cases were identified in the variables used for the analysis. A small number of multivariate outliers were flagged using Mahalanobis distance, but these cases were retained because their response patterns were plausible and reflected normal variation within the study population. Ethical principles were observed throughout the study. Participation was voluntary, informed consent was obtained, and anonymity and confidentiality were maintained during data collection, analysis, storage, and reporting.
2.6. Data Analysis
Quantitative data were analyzed using descriptive and inferential statistics, while qualitative responses from the open-ended item were analyzed thematically. Data preparation involved editing, coding, entry, cleaning, and screening for completeness, consistency, range errors, duplicate cases, missing values, and multivariate outliers. Descriptive statistics, specifically frequencies, percentages, means, and standard deviations, were used to summarize respondent characteristics and the levels of training and organizational performance. Composite scores derived from Likert-scale items were treated as continuous variables for inferential analysis, consistent with established practice in social science research where aggregated ordinal items approximate interval-level measurement.
2.7. Qualitative Analysis
Qualitative evidence came from one open-ended item included in the questionnaire. The exact item asked: “In your view, what should be done to improve training so that it contributes more effectively to organisational performance in your State Department?” This question was appropriate because it invited respondents to explain the practical conditions through which training affected performance.
Responses were analysed thematically through five steps: familiarisation with all responses, open coding of meaningful statements, grouping of similar codes into preliminary categories, consolidation of categories into broader themes, and frequency counting to show the relative prominence of each theme. The main themes included induction and orientation, training relevance and needs assessment, equitable access, post-training support, monitoring and evaluation, and continuous learning.
More than one theme could be assigned to a single response where the response contained multiple distinct ideas. For example, a response calling for “fair training access and follow-up coaching” was coded under both equitable access and post-training support. This explains why the qualitative theme percentages are not expected to sum to 100%. The qualitative findings were then integrated with the quantitative results through joint display and narrative interpretation.
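The multiple-coding rule can be sketched as follows. The coded responses are hypothetical; the first mirrors the example above, in which one response is assigned to both equitable access and post-training support. Percentages are taken over the number of responses rather than the number of codes, which is why they can sum to more than 100%.

```python
from collections import Counter

# Hypothetical coded responses: each response carries one or more themes.
coded_responses = [
    {"equitable access", "post-training support"},
    {"induction and orientation"},
    {"training relevance and needs assessment", "monitoring and evaluation"},
    {"induction and orientation", "equitable access"},
]


def theme_percentages(responses):
    """Share of responses mentioning each theme, in percent."""
    counts = Counter(theme for themes in responses for theme in themes)
    n = len(responses)
    return {theme: round(100 * count / n, 1) for theme, count in counts.items()}


percentages = theme_percentages(coded_responses)
total = sum(percentages.values())
print(percentages["equitable access"], total)   # 50.0 175.0
```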
2.8. Diagnostic Tests and Decision Criteria
A series of diagnostic tests was specified to assess the assumptions of regression analysis before final hypothesis testing. Normality of residuals was assessed using the Shapiro-Wilk test, histograms, and Q-Q plots. Linearity and model form were examined using residual-versus-fitted plots and the Ramsey RESET test. Multicollinearity was assessed using variance inflation factor and tolerance statistics, with VIF values below 5 and tolerance values above 0.20 treated as acceptable for the final controlled model.
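The two multicollinearity criteria are linked: for predictor j, tolerance is 1 − R²_j and VIF is its reciprocal, where R²_j comes from regressing that predictor on the remaining predictors. The sketch below applies the thresholds stated above to hypothetical R²_j values; the function names are illustrative.

```python
def tolerance(r_squared_j):
    """Tolerance of predictor j given R^2 from regressing it on the others."""
    return 1.0 - r_squared_j


def vif(r_squared_j):
    """Variance inflation factor: the reciprocal of tolerance."""
    return 1.0 / tolerance(r_squared_j)


def acceptable(r_squared_j, max_vif=5.0, min_tolerance=0.20):
    """Apply the decision rule: VIF below 5 and tolerance above 0.20."""
    return vif(r_squared_j) < max_vif and tolerance(r_squared_j) > min_tolerance


print(vif(0.75), acceptable(0.75))   # 4.0 True
print(acceptable(0.90))              # False
```

Because VIF = 1/tolerance, the two thresholds describe the same cut-off, namely R²_j below 0.80.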
Homoscedasticity was assessed using residual plots, the Breusch-Pagan test, and the White test. Where non-constant error variance was detected, heteroskedasticity-robust standard errors were used. Accordingly, the final multiple regression results report robust standard errors, p-values, and confidence intervals.
The RESET-test results were reconciled by distinguishing the simple objective-specific model from the final controlled model. In the simple training-only diagnostic model, the RESET result indicated specification caution, suggesting that a bivariate linear model did not capture all relevant structure in organisational performance. In the final theoretically specified model, the inclusion of career development, knowledge management, organisational development, and employee motivation, together with robust inference, provided the basis for interpretation. The RESET result was therefore treated as a caution for interpretation rather than as a reason to discard the model.
Independence of errors was assessed using the Durbin-Watson statistic, while influential observations were reviewed using Cook’s distance, leverage values, and externally studentised residuals. Cases exceeding review thresholds were inspected rather than removed automatically, because plausible variation across departments and cadres was expected in the public-service setting.
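As an illustration of the independence check, the Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals; values near 2 indicate no first-order autocorrelation. The residuals below are hypothetical and chosen only to show the two extremes of the statistic.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for an ordered sequence of residuals."""
    squared_differences = sum(
        (current - previous) ** 2
        for previous, current in zip(residuals, residuals[1:])
    )
    squared_residuals = sum(e ** 2 for e in residuals)
    return squared_differences / squared_residuals


print(durbin_watson([1, -1, 1, -1]))   # 3.0, alternating signs
print(durbin_watson([1, 1, 1, 1]))     # 0.0, identical consecutive residuals
```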
2.9. Correlation Analysis
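The relationship between training and organizational performance was first examined using Pearson’s product-moment correlation coefficient. The bivariate correlation model was expressed as:
$$r_{T,OP} = \frac{\sum_{i=1}^{N} (T_i - \bar{T})(OP_i - \overline{OP})}{\sqrt{\sum_{i=1}^{N} (T_i - \bar{T})^2 \, \sum_{i=1}^{N} (OP_i - \overline{OP})^2}}$$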
where T denotes the composite score for training and development and OP denotes the composite score for organizational performance. The test was conducted to determine the direction and strength of association between the two variables. The decision criterion for statistical significance was set at p < 0.05. A positive coefficient indicated that higher levels of training and development were associated with higher organizational performance, while a negative coefficient indicated an inverse association. In interpreting magnitude, coefficients close to zero were treated as weak, intermediate values as moderate, and larger coefficients as strong associations.
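A minimal sketch of the coefficient, written out from the definition of Pearson's product-moment correlation, is given below. The composite scores are hypothetical and serve only to show the sign convention described above.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical composite scores for training (T) and performance (OP).
t_scores = [1.0, 2.0, 3.0, 4.0]
op_rising = [2.0, 4.0, 6.0, 8.0]
op_falling = [8.0, 6.0, 4.0, 2.0]

print(round(pearson_r(t_scores, op_rising), 3))    # 1.0
print(round(pearson_r(t_scores, op_falling), 3))   # -1.0
```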
2.10. Bivariate Regression Model
To estimate the direct effect of training on organizational performance in isolation, a simple linear regression model was specified as:
$$OP = \beta_0 + \beta_1 T + \varepsilon$$
where OP represents organizational performance, β0 is the intercept, β1 is the regression coefficient for training, T is the independent variable measuring training and development, and ε is the error term. In this model, organizational performance was the dependent variable and training was the independent variable. The decision criteria were as follows: the null hypothesis was rejected if the coefficient for training was statistically significant at p < 0.05; the sign of β1 indicated the direction of the effect; the coefficient of determination (R²) indicated the proportion of variance in organizational performance explained by training; and the F-statistic was used to determine whether the model as a whole was statistically significant at p < 0.05. A positive and significant β1 implied that an increase in training and development was associated with an increase in organizational performance.
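The bivariate model can be sketched with closed-form ordinary least squares estimates: β1 is the ratio of the cross-product of deviations to the sum of squared deviations of T, β0 follows from the means, and R² is one minus the ratio of residual to total sum of squares. The data are hypothetical, and significance testing is omitted from this illustration.

```python
def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for y = b0 + b1 * x + error."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    ss_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_xx = sum((a - mean_x) ** 2 for a in x)
    slope = ss_xy / ss_xx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return intercept, slope, 1 - ss_residual / ss_total


# Hypothetical scores lying exactly on OP = 1 + 2T.
b0, b1, r2 = simple_ols([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(b0, b1, r2)   # 1.0 2.0 1.0
```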
2.11. Multiple Regression Model
To estimate the net effect of training while controlling for other Strategic Human Resource Development variables, a multiple linear regression model was specified as:
$$OP = \beta_0 + \beta_1 T + \beta_2 CD + \beta_3 KM + \beta_4 OD + \beta_5 M + \varepsilon$$
where OP denotes organisational performance, T training, CD career development, KM knowledge management, OD organisational development, M employee motivation, β0 the intercept, β1 to β5 the regression coefficients, and ε the error term. The key coefficient for this article was β1, which represented the unique effect of training on organisational performance after accounting for the other predictors. R², adjusted R², the overall F-test, robust standard errors, confidence intervals, p-values, VIF values, and tolerance values were reported for the final model.
2.12. Hypothesis Decision Criteria
For all inferential tests, statistical significance was set at p < 0.05. Accordingly, the null hypothesis was rejected where the p-value for the relevant coefficient or test statistic was less than 0.05 and was not rejected where the p-value was equal to or greater than 0.05. For correlation analysis, a statistically significant Pearson coefficient indicated a meaningful association between training and organizational performance. For bivariate and multiple regression, a statistically significant regression coefficient for training indicated that training and development had a significant effect on organizational performance. The combined use of descriptive statistics, correlation analysis, regression modelling, diagnostic testing, and thematic analysis strengthened the rigour of the study and ensured that the reported findings were statistically defensible and substantively interpretable.
3. Results
3.1. Descriptive Findings
Table 3 presents the descriptive results for Training (T) and Organizational Performance (OP) based on the 426 valid responses retained for analysis. Both constructs were rated at a moderate level, with composite means of 3.13 (SD = 1.03) for T and 3.38 (SD = 0.93) for OP, indicating moderate perceptions of training practices and performance outcomes across the ministries.
Within the T construct, the clearest strength was the perceived contribution of training to competence and day-to-day work quality (T7), which was the only item rated high. By contrast, equitable access to training opportunities (T4) and post-training support for applying acquired skills (T5) recorded the lowest means, pointing to weaker provision in these areas. The remaining T items fell within the moderate range, suggesting that training practices were present but uneven in coverage and follow-through.
OP items were more uniform, with all means falling within a narrow moderate range. This indicates relatively consistent perceptions of organisational performance across the ministries, without marked variation across performance indicators.
The item-level pattern shown in Table 3 indicates that T scores cluster around the moderate range, with T7 standing out as the strongest item and T4 and T5 as the weakest. Overall, the descriptive results suggest that training was moderately established, but less robust in equitable access and post-training support.
Table 3. Descriptive statistics for Training (T) and Organizational Performance (OP) (N = 426).
Measure | Mean | Std. Dev. | Min | Max | Interpretation |
T1 | 3.09 | 1.31 | 1.0 | 5.0 | Moderate |
T2 | 3.14 | 1.30 | 1.0 | 5.0 | Moderate |
T3 | 3.25 | 1.24 | 1.0 | 5.0 | Moderate |
T4 | 2.86 | 1.25 | 1.0 | 5.0 | Moderate |
T5 | 2.96 | 1.24 | 1.0 | 5.0 | Moderate |
T6 | 3.06 | 1.20 | 1.0 | 5.0 | Moderate |
T7 | 3.57 | 1.27 | 1.0 | 5.0 | High |
T (Composite) | 3.13 | 1.03 | 1.0 | 5.0 | Moderate |
OP1 | 3.37 | 1.09 | 1.0 | 5.0 | Moderate |
OP2 | 3.43 | 1.04 | 1.0 | 5.0 | Moderate |
OP3 | 3.38 | 1.09 | 1.0 | 5.0 | Moderate |
OP4 | 3.33 | 1.12 | 1.0 | 5.0 | Moderate |
OP5 | 3.37 | 1.06 | 1.0 | 5.0 | Moderate |
OP6 | 3.41 | 1.07 | 1.0 | 5.0 | Moderate |
OP (Composite) | 3.38 | 0.93 | 1.0 | 5.0 | Moderate |
Note. T item means show moderate agreement overall, with the highest score on perceived contribution to competence and work quality (T7) and the lowest scores on equitable access (T4) and post-training support (T5).
3.2. Qualitative Findings on Training and Development and Organizational Performance
Table 4 presents the qualitative findings from the open-ended T8 responses. Overall, respondents converged on a clear set of conditions under which training was perceived to improve organisational performance: structured induction, job-relevant training, equitable access across cadres and units, post-training support, regular needs assessment, and simple monitoring of training outcomes.
Across the responses, participants did not question the value of training itself. Rather, they emphasised that its effect depends on how well it is implemented and supported. Structured induction was linked to quicker adjustment, clearer role expectations, and fewer early mistakes. Training relevance was associated with improved service quality and responsiveness, while equitable access emerged as a recurring concern, particularly where opportunities appeared uneven across staff categories and units. Respondents also stressed the importance of coaching, tools, and time for applying learning after training.
Table 5 further shows the relative prominence of these themes. The most frequently cited issues were training relevance and needs assessment, and induction and orientation. Monitoring, evaluation, and implementation accountability also featured strongly, followed by general training provision, post-training support, and fairness in access. This pattern indicates that respondents were primarily concerned with the practical design, targeting, and follow-through of training.
Overall, the qualitative results show that training was perceived to contribute more strongly to performance where it was relevant, fairly distributed, and supported after delivery. Where these conditions were weak, training was described as less likely to produce clear operational gains.
Table 4 presents the dominant themes emerging from T8 and links each theme to its likely implication for organisational performance. Illustrative excerpts are short and anonymised.
Table 4. Qualitative theme matrix for Training (T8).
Theme | Summary of what respondents reported | Illustrative quote (short) | Implication for performance |
Induction and orientation | Frequent calls for structured onboarding before role assumption. | “Mandatory induction on expectations and procedures before deployment.” | Improves early competence, role clarity, and error reduction. |
Training relevance to job | Need for training aligned to current duties and emerging skills. | “Revise courses to match current tasks and emerging demands.” | Enhances service quality, responsiveness, and practical application of learning. |
Equitable access and coverage | Concerns about unequal access across cadres and units. | “Ensure every cadre has fair access through an annual training plan.” | Improves fairness, morale, and breadth of capability development. |
Post-training support and coaching | Requests for follow-up coaching, tools, and time to apply skills. | “Provide structured coaching and tools so skills are applied after training.” | Strengthens transfer of training to workplace performance. |
Training needs assessment | Calls for regular, systematic needs assessment to prioritise courses. | “Conduct annual training needs assessments to guide targeted courses.” | Directs resources to high-impact skill gaps and operational priorities. |
Monitoring and evaluation | Need for before-and-after measures and training impact tracking. | “Measure impact using before/after performance and service indicators.” | Supports accountability, learning improvement, and evidence-based training decisions. |
Table 5. Condensed qualitative theme summary.
Theme | Condensed finding | Freq. | % |
Training relevance and needs assessment | The most frequently expressed concern was the need to align training more closely with job requirements and priority skill gaps. | 101 | 31.7 |
Induction and orientation | Structured onboarding was repeatedly linked to faster adjustment, clearer expectations, and fewer early mistakes. | 101 | 31.7 |
Monitoring, evaluation and implementation accountability | Respondents called for training impact to be measured and for HRD units or line managers to take clearer responsibility for implementation follow-through. | 81 | 25.4 |
General training provision and continuous learning | Many respondents called for more regular training, indicating unmet demand for continuous staff development. | 70 | 21.9 |
Post-training support and coaching | Respondents stressed that coaching, tools, and follow-up are necessary if training is to translate into improved performance. | 60 | 18.8 |
Access and fairness in coverage | Staff highlighted the need for broader and fairer access to training across cadres, units, and employees. | 48 | 15.0 |
3.3. Integrated Qualitative and Quantitative Findings on Training and Organizational Performance
Table 6 presents the integrated qualitative and quantitative findings. Overall, the two strands showed strong alignment. Quantitative results indicated that training was moderately developed, with its contribution to competence and work quality emerging as the strongest area, while equitable access and post-training support remained weaker. The qualitative findings mirrored this pattern by identifying induction, access, relevance, follow-up support, and monitoring as the conditions that shape the performance value of training.
The clearest convergence concerned equitable access and post-training support. These were among the weaker-rated quantitative dimensions and were also repeatedly identified in the qualitative responses as practical gaps limiting the effectiveness of training. By contrast, the strongest quantitative result, the contribution of training to competence and work quality, was reinforced by qualitative accounts linking job-relevant training to better responsiveness, clearer task execution, and improved service quality.
Taken together, the integrated findings indicate that training was present within the ministries, but its performance effect depended on implementation quality. The evidence shows that training was more likely to support organisational performance when it was relevant to work demands, fairly distributed, supported after delivery, and followed up through simple monitoring mechanisms.
Table 6. Joint display integrating Training (T) and Organizational Performance (OP) findings.
Key quantitative result | Qualitative explanation | Integration | Meta-inference |
Overall T composite mean indicates moderate practice. | Respondents emphasized practical gaps, especially induction, access, and post-training support. | Complement | T exists, but its performance value depends on consistency and follow-through. |
Equitable access to training is one of the weaker-rated dimensions. | Frequent concern that training opportunities are uneven across cadres and units. | Convergence | Improving fairness in access is a high-leverage reform for strengthening performance. |
Post-training support is comparatively weak and variable. | Respondents repeatedly called for coaching, tools, and time to apply learning. | Convergence | Transfer-of-training mechanisms are needed if training is to influence service outcomes. |
Training contributes most strongly to competence and work quality. | Respondents linked job-relevant training to better responsiveness, clearer execution, and improved service quality. | Convergence | When training is relevant to actual work, it is more likely to improve day-to-day performance. |
Table 6 integrates the quantitative and qualitative findings for Objective One and shows where the two strands converge or complement each other in explaining the performance value of training and development.
3.4. Diagnostic Testing Results
Before hypothesis testing, diagnostic tests were conducted on the core linear regression model in which organizational performance (OP) was regressed on training (T). The model was specified as: OP = β0 + β1T + ε, where OP denotes organizational performance, T denotes training, β0 is the intercept, β1 is the regression coefficient for training, and ε is the error term. The diagnostics were based on the 426 returned and retained cases in the dataset.
Step 1: Normality of residuals
Residual normality was assessed using both formal and visual procedures. The Shapiro-Wilk test indicated a statistically significant departure from strict normality (W = 0.986, p < 0.001). The Jarque-Bera test showed the same pattern (JB = 8.522, p = 0.014). The residuals showed mild negative skewness (skew = −0.321) and slight leptokurtosis (kurtosis = 3.285). These results indicate mild non-normality rather than a severe departure.
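As a rough cross-check, the Jarque-Bera result can be related to the reported skewness and kurtosis. The sketch below is illustrative only and is not the analysis code used in the study. It assumes the textbook formula JB = n/6 × (S² + (K − 3)²/4) and a chi-square reference distribution with 2 degrees of freedom, whose upper-tail probability is exp(−x/2). The function names are hypothetical.

```python
import math

def jarque_bera(n, skew, kurtosis):
    """Textbook Jarque-Bera statistic from sample size, skewness, and non-excess kurtosis."""
    return n / 6.0 * (skew ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# Inputs are the values reported in the text (N = 426, skew = -0.321, kurtosis = 3.285).
jb_from_moments = jarque_bera(426, -0.321, 3.285)

# p-value implied by the reported statistic JB = 8.522.
p_reported_jb = chi2_sf_df2(8.522)

print(round(jb_from_moments, 2), round(p_reported_jb, 3))
```

Under these assumptions the recomputed statistic is about 8.76, close to but not identical with the reported 8.522, which is expected because software packages differ in how they bias-correct sample moments. The p-value implied by the reported statistic rounds to 0.014, matching the text.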
Step 2: Linearity and model specification
Linearity was assessed through the residual pattern and the Ramsey RESET test. The RESET test was statistically significant (F = 18.652, p < 0.001), suggesting some evidence of specification error or omitted non-linear structure. Even so, the residual plot still showed a broadly monotonic positive relationship, so the linear model was retained as a useful baseline while the specification concern was explicitly noted.
Step 3: Multicollinearity
Because this was a simple linear regression model with only one predictor, multicollinearity was not a substantive concern. The variance inflation factor for training was VIF = 1.000, with a corresponding tolerance of 1.000, confirming the absence of collinearity problems.
Step 4: Homoscedasticity
Homoscedasticity was assessed using both the Breusch-Pagan and White tests. The Breusch-Pagan test was statistically significant (LM = 9.032, p = 0.003), indicating evidence of heteroskedasticity. The White test similarly suggested non-constant error variance (LM = 9.568, p = 0.008). These findings imply that the residual variance was not fully constant across the fitted values.
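The reported p-values can be checked against the LM statistics with a short sketch. It is a hedged illustration rather than the study's own procedure: it assumes that the Breusch-Pagan statistic is referred to a chi-square distribution with 1 degree of freedom (one predictor in the auxiliary regression) and the White statistic to one with 2 degrees of freedom (the predictor and its square). The helper name chi2_sf is hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability; this sketch covers only df = 1 and df = 2."""
    if df == 1:
        # P(Z^2 > x) for a standard normal Z
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")

p_bp = chi2_sf(9.032, df=1)     # assumed df = 1 for the Breusch-Pagan test
p_white = chi2_sf(9.568, df=2)  # assumed df = 2 for the White test

print(round(p_bp, 3), round(p_white, 3))
```

With these assumed degrees of freedom the sketch returns values that round to 0.003 and 0.008, in line with the p-values reported above.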
Step 5: Independence of errors
Independence of errors was assessed using the Durbin-Watson statistic. The obtained value (DW = 1.913) was close to 2.0, indicating no evidence of serial correlation. The residuals were therefore considered independent.
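For readers unfamiliar with the statistic, the following minimal sketch shows how a Durbin-Watson value is computed from an ordered series of residuals. The residuals are made-up numbers for illustration only and are not study data; the function name is hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Made-up residuals (not study data).
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every case: negative autocorrelation
trending = [1.0, 1.0, 1.0, -1.0]      # runs of the same sign: positive autocorrelation

print(durbin_watson(alternating), durbin_watson(trending))
```

The statistic ranges from 0 to 4, and values near 2 indicate no first-order autocorrelation. Because it depends on the order of cases in the data file, it is most informative when that order is meaningful.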
Step 6: Influential cases and outliers
Potentially influential observations were examined using Cook’s distance, leverage values, and externally studentised residuals. Using the threshold of 4/N = 0.0094, 27 cases were flagged by Cook’s distance, with a maximum Cook’s distance of 0.026. Using the leverage threshold of 2(k + 1)/N = 0.0094, 44 cases were flagged, with a maximum leverage value of 0.013. Based on the criterion of |studentised residual| > 3, 2 cases were flagged, with the largest absolute studentised residual being 3.258. These cases were retained for analysis but noted for sensitivity review.
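The two review thresholds quoted above follow directly from the sample size and the number of predictors. The short sketch below reproduces them; the function names are hypothetical, and the rules of thumb are the ones stated in the text.

```python
def cooks_threshold(n):
    """4/N rule of thumb for Cook's distance."""
    return 4.0 / n

def leverage_threshold(n, k):
    """2(k + 1)/N rule of thumb for leverage, with k predictors."""
    return 2.0 * (k + 1) / n

n_cases, n_predictors = 426, 1
cook_cut = cooks_threshold(n_cases)
leverage_cut = leverage_threshold(n_cases, n_predictors)

print(round(cook_cut, 4), round(leverage_cut, 4))
```

Both thresholds round to 0.0094. They coincide only because the diagnostic model has a single predictor (k = 1); with more predictors the leverage threshold would be larger than the Cook's distance threshold.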
Overall Diagnostic Conclusion
Taken together, the diagnostic results showed that the core linear regression model was usable for inferential analysis, but with caution. The assumptions of independence of errors and absence of multicollinearity were satisfied. However, the residuals showed mild departures from strict normality, there was evidence of heteroskedasticity, and the RESET test suggested some model specification risk. Accordingly, the model remains informative for testing the effect of training and development on organizational performance, but interpretation should acknowledge these diagnostic limitations and, where possible, should be supported by robust standard errors or sensitivity checks.
The detailed diagnostic tables are presented in Tables 7-13, and the diagnostic plots in Figures 1-5.
Table 7. Preliminary diagnostic test summary.
Assumption/Test | Statistic/Threshold | Result | Decision | Implication for analysis
Normality of residuals (Shapiro-Wilk) | p > 0.05 | W = 0.986, p < 0.001 | Minor departure | Residuals not strictly normal, but violation is mild
Normality of residuals (Jarque-Bera) | p > 0.05 | JB = 8.522, p = 0.014 | Minor departure | Large-sample inference remains feasible with caution
Linearity (Ramsey RESET) | p > 0.05 | F = 18.652, p < 0.001 | Review | Possible omitted non-linear structure
Multicollinearity (VIF) | VIF < 10 | 1.000 | Accept | No collinearity concern
Tolerance | > 0.10 | 1.000 | Accept | Predictor is statistically stable
Homoscedasticity (Breusch-Pagan) | p > 0.05 | LM = 9.032, p = 0.003 | Review | Evidence of heteroskedasticity
Homoscedasticity (White test) | p > 0.05 | LM = 9.568, p = 0.008 | Review | Error variance may not be constant
Independence of errors (Durbin-Watson) | ≈ 2.0 | 1.913 | Accept | Serial correlation not indicated
Influential cases (Cook’s distance) | > 0.0094 flagged | 27 cases; max = 0.026 | Review | Retain unless distortion is confirmed
High leverage | > 0.0094 flagged | 44 cases; max = 0.013 | Review | Check alongside residuals
Outlying residuals | Absolute studentised residual > 3 flagged | 2 cases; maximum absolute value = 3.258 | Review | Inspect, but do not remove automatically
Caption: Summary of the diagnostic tests conducted before inferential analysis of the simple linear regression model for Objective 1.
Table 8. Normality and residual distribution statistics.
Indicator | Observed value
Shapiro-Wilk W | 0.986
Shapiro-Wilk p-value | <0.001
Jarque-Bera statistic | 8.522
Jarque-Bera p-value | 0.014
Residual skewness | −0.321
Residual kurtosis | 3.285
Caption: Detailed residual normality statistics for the Objective 1 regression model.
Table 9. Core model fit and specification summary.
Indicator | Observed value
Number of complete returned cases | 426
Intercept (β0) | 2.106
Slope for training (β1) | 0.403
R-squared | 0.192
Model F-statistic | 101.039
Model p-value | <0.001
Ramsey RESET F-statistic | 18.652
Ramsey RESET p-value | <0.001
Caption: Baseline fit statistics for the training and development model predicting organizational performance.
Table 10. Homoscedasticity diagnostic summary.
Indicator | Observed value
Breusch-Pagan LM statistic | 9.032
Breusch-Pagan p-value | 0.003
White LM statistic | 9.568
White p-value | 0.008
Interpretation | Evidence of non-constant variance
Caption: Formal tests of constant error variance for the Objective 1 regression model.
Table 11. Independence and collinearity checks.
Indicator | Observed value
Durbin-Watson statistic | 1.913
Independence decision | Accept
Variance inflation factor (VIF) | 1.000
Tolerance | 1.000
Multicollinearity decision | Accept
Caption: Additional assumption checks for independence of errors and predictor stability.
Diagnostic Figures
Figures 1-5 present the diagnostic plots in their order of appearance: the Q-Q plot, histogram, residuals-versus-fitted plot, scale-location plot, and Cook’s distance plot.
Caption: The Q-Q plot indicates mild departures from strict normality, with most residuals aligning reasonably well around the centre and modest deviations appearing mainly in the tails.
Figure 1. Q-Q plot of regression residuals.
Caption: The residuals are approximately bell-shaped, with mild negative skewness and slight tail thickness.
Figure 2. Histogram of regression residuals.
Caption: The residual pattern suggests a broadly positive linear form, although the smooth curve indicates some remaining specification risk.
Figure 3. Residuals versus fitted values.
Caption: The spread of residuals varies somewhat across fitted values, consistent with the formal heteroskedasticity tests.
Figure 4. Scale-location plot.
Caption: A limited number of observations exceed the 4/N review threshold, but none display extreme influence warranting automatic deletion.
Figure 5. Cook’s distance by observation.
Table 12. Influence diagnostics summary.
Indicator | Threshold | Observed result
Cook’s distance | > 0.0094 | 27 cases flagged; max = 0.026
Leverage | > 0.0094 | 44 cases flagged; max = 0.013
Externally studentised residuals | Absolute value > 3 | 2 cases flagged; maximum absolute value = 3.258
Caption: Threshold-based summary of influential observations reviewed in the Objective 1 model.
Table 13. Top observations for influence review.
Questionnaire ID | TD score | OP score | Cook’s d | Leverage | Studentised residual
SHRD-0472 | 1.429 | 4.667 | 0.026 | 0.009 | 2.372
SHRD-0360 | 1.000 | 4.167 | 0.026 | 0.013 | 1.981
SHRD-0401 | 1.000 | 1.000 | 0.022 | 0.013 | −1.802
SHRD-0144 | 1.000 | 1.000 | 0.022 | 0.013 | −1.802
SHRD-0478 | 1.714 | 4.833 | 0.021 | 0.007 | 2.432
SHRD-0148 | 1.714 | 4.833 | 0.021 | 0.007 | 2.432
SHRD-0334 | 1.714 | 4.833 | 0.021 | 0.007 | 2.432
SHRD-0438 | 4.000 | 1.000 | 0.020 | 0.004 | −3.258
SHRD-0306 | 4.000 | 1.000 | 0.020 | 0.004 | −3.258
SHRD-0072 | 5.000 | 2.500 | 0.018 | 0.010 | −1.934
Caption: Ten observations with the largest Cook’s distance values in the Objective 1 model.
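To see why particular cases were flagged, the fitted value and raw residual for a case can be recomputed from the baseline coefficients in Table 9. The sketch below is illustrative and uses hypothetical function names. It yields raw residuals, not the externally studentised residuals reported in Table 13, so only the signs and relative sizes are comparable.

```python
B0, B1 = 2.106, 0.403  # intercept and slope reported in Table 9

def fitted_op(td_score):
    """Fitted OP score from the baseline simple regression."""
    return B0 + B1 * td_score

def raw_residual(td_score, op_score):
    """Observed OP minus fitted OP (not studentised)."""
    return op_score - fitted_op(td_score)

# (questionnaire ID, TD score, OP score) taken from Table 13
cases = [
    ("SHRD-0472", 1.429, 4.667),
    ("SHRD-0438", 4.000, 1.000),
    ("SHRD-0401", 1.000, 1.000),
]

for case_id, td, op in cases:
    print(case_id, round(fitted_op(td), 3), round(raw_residual(td, op), 3))
```

For example, case SHRD-0438 combines a high training score (4.000) with the lowest possible performance score (1.000), so its observed value falls about 2.7 points below the fitted line, consistent with its large negative studentised residual.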
3.5. Inferential Analysis on Training and Organizational Performance
Inferential analysis was conducted to test whether Training significantly predicted Organizational Performance. First, a bivariate correlation assessed the direction and strength of association between the TD and OP composite scores. Second, a final multiple regression model estimated the unique effect of training on OP while controlling for career development, knowledge management, organisational development, and employee motivation. The final regression results are reported with heteroskedasticity-robust standard errors because the diagnostic tests indicated non-constant variance.
3.5.1. Correlation Analysis
Table 14 presents the bivariate correlation between the composite scores for Training (T) and Organizational Performance (OP). The analysis showed a moderate, positive, and statistically significant relationship between the two variables (r = 0.48, p < 0.001), indicating that ministries reporting stronger training practices also tended to report higher levels of organisational performance. In substantive terms, the result suggests that improvements in training systems are associated with better performance outcomes, although the strength of the relationship is moderate rather than weak or very strong.
Table 14. Correlation between Training (T) and Organisational Performance (OP).
Variables | Pearson r | p-value | Decision (p < 0.05) | Interpretation (direction/strength)
TD (Composite) vs OP (Composite) | 0.48 | <0.001 | Significant | Positive (moderate)
Overall, the correlation result provides initial inferential evidence that training and development is positively associated with organisational performance and therefore merits further examination in a multivariate model.
3.5.2. Regression Analysis
To determine whether Training retained an independent effect on Organisational Performance after accounting for the wider SHRD context, a multiple linear regression model was estimated with OP as the dependent variable and training (TD), career development (CD), knowledge management (KM), organisational development (OD), and employee motivation (M) entered as predictors. The estimated model was:
OP = β0 + β1TD + β2CD + β3KM + β4OD + β5M + ε
As shown in Table 15, the final model explained 38.1% of the variance in organisational performance (R² = 0.381, adjusted R² = 0.374) and was statistically significant, F(5, 420) = 51.70, p < 0.001. In this model, the coefficient of principal interest was β1, representing the unique effect of training while holding the other predictors constant. Training remained a positive and statistically significant predictor of organisational performance (B = 0.122, beta = 0.132, robust SE = 0.054, p = 0.024, 95% CI [0.016, 0.228]). This confirms that training retained a unique contribution after accounting for related SHRD domains and employee motivation.
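The fit statistics and confidence interval reported above are mutually consistent, as the following sketch illustrates. It is a hedged check and not the study's estimation code: it uses the standard formulas for adjusted R² and the overall F-statistic, and it assumes a normal critical value of 1.96 for the interval, which differs only slightly from the t value for 420 degrees of freedom. Function names are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n cases and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def overall_f(r2, n, k):
    """Overall model F-statistic implied by R-squared."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

def wald_ci(b, se, crit=1.96):
    """Approximate 95% interval; crit = 1.96 is an assumed normal critical value."""
    return (b - crit * se, b + crit * se)

n, k, r2 = 426, 5, 0.381
low, high = wald_ci(0.122, 0.054)

print(round(adjusted_r2(r2, n, k), 3), round(overall_f(r2, n, k), 2))
print(round(low, 3), round(high, 3))
```

Under these assumptions the sketch reproduces the adjusted R² of 0.374, an F-statistic of about 51.70, and an interval of [0.016, 0.228] for the training coefficient.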
Table 15. Regression results for Training and Development (TD) predicting Organisational Performance (OP).
Predictor | B | Beta | Robust SE | p-value | 95% CI for B | VIF | Tolerance | Interpretation
Training (TD) | 0.122 | 0.132 | 0.054 | 0.024 | [0.016, 0.228] | 1.649 | 0.606 | Significant positive effect
Career development (CD) | 0.148 | 0.135 | 0.058 | 0.011 | [0.034, 0.262] | 1.941 | 0.515 | Significant positive effect
Knowledge management (KM) | 0.121 | 0.109 | 0.082 | 0.140 | [−0.040, 0.282] | 1.773 | 0.564 | Positive but not significant in controlled robust model
Organisational development (OD) | 0.244 | 0.214 | 0.062 | <0.001 | [0.122, 0.366] | 1.436 | 0.696 | Significant positive effect
Employee motivation (M) | 0.181 | 0.206 | 0.064 | 0.004 | [0.056, 0.306] | 2.039 | 0.490 | Significant positive effect
Model fit | R² = 0.381 | Adjusted R² = 0.374 | - | F(5, 420) = 51.70; p < 0.001 | - | - | - | Overall model significant
Taken together, the regression findings show that training does not merely correlate with organisational performance at the bivariate level; it also retains a statistically significant positive effect in the controlled model. The VIF values ranged from 1.436 to 2.039 and tolerance values ranged from 0.490 to 0.696, confirming that multicollinearity did not reach a harmful level in the final model.
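Tolerance is the reciprocal of VIF, so the two columns in Table 15 can be checked against each other. The sketch below does this for the reported values; the function names are hypothetical, and the 0.001 margin simply allows for rounding to three decimals.

```python
# (VIF, tolerance) pairs as reported in Table 15
reported = {
    "TD": (1.649, 0.606),
    "CD": (1.941, 0.515),
    "KM": (1.773, 0.564),
    "OD": (1.436, 0.696),
    "M": (2.039, 0.490),
}

def tolerance_from_vif(vif):
    """Tolerance is defined as 1 / VIF."""
    return 1.0 / vif

def implied_r2(vif):
    """Share of a predictor's variance explained by the other predictors."""
    return 1.0 - 1.0 / vif

checks = {
    name: abs(tolerance_from_vif(vif) - tol) < 0.001
    for name, (vif, tol) in reported.items()
}

print(checks)
print(round(implied_r2(2.039), 2))
```

All five pairs agree within rounding. The largest VIF (2.039, employee motivation) implies that about 51% of that predictor's variance is shared with the other predictors, which is well below conventional concern levels such as VIF = 10.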
3.5.3. Hypothesis Testing Decision
Table 16 summarises the hypothesis test for Objective One. Since Training had a positive and statistically significant effect on Organisational Performance in the final controlled model (B = 0.122, beta = 0.132, robust SE = 0.054, p = 0.024), the null hypothesis was rejected. This result indicates that training made a substantive contribution to organisational performance in the sampled ministries and state departments, particularly where training was relevant, fairly accessible, and supported after delivery.
Table 16. Hypothesis decision for H01.
Hypothesis | Key evidence | Decision | Implication
H01: Training has no significant effect on organisational performance. | B = 0.122; beta = 0.132; robust SE = 0.054; p = 0.024; R² = 0.381; adjusted R² = 0.374; F(5, 420) = 51.70, p < 0.001. | Reject H01 | Training has a statistically significant positive effect on organisational performance after controlling for CD, KM, OD, and employee motivation.
Table 16 summarizes the decision on H01 based on the regression evidence.
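The decision rule can be illustrated with a short sketch that recomputes the test statistic and an approximate p-value from the reported coefficient and robust standard error. It assumes a standard normal reference distribution, which is a large-sample approximation rather than the exact procedure used in the study, and the function name is hypothetical.

```python
import math

def two_sided_p_normal(b, se):
    """Return z = b / se and its two-sided p-value under a standard normal approximation."""
    z = b / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

z, p = two_sided_p_normal(0.122, 0.054)
reject_h01 = p < 0.05  # decision at the 5% significance level

print(round(z, 2), round(p, 3), reject_h01)
```

Under this approximation the statistic is about 2.26 and the p-value rounds to 0.024, which agrees with Table 16 and supports rejecting H01 at the 5% level.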
Overall, the inferential results show a clear and statistically significant positive relationship between Training and Organizational Performance. The correlation result indicates a moderate positive association, while the final robust multiple regression confirms that Training remains a significant predictor after controlling for career development, knowledge management, organisational development, and employee motivation.
4. Discussion
This study finds that training is a significant driver of organizational performance in Kenya’s national government ministries. Ministries with more robust training programs reported greater efficiency, responsiveness, target achievement, and overall service quality. This implies that training in the public service should be treated not merely as a procedural or compliance-based mechanism but as an operational tool through which ministries develop employees’ capabilities and raise institutional performance. At the same time, the effect of training was statistically significant but moderate rather than transformative. Overall, training was rated at a moderate level. Its strongest contribution was to practical competence and the quality of day-to-day work, while equitable access and post-training support were the weaker domains. The qualitative findings are consistent with this pattern. They indicate that performance gains were larger where training was relevant to job requirements, supported by structured induction practices, fairly distributed across cadres, sustained through coaching and follow-up, and informed by regular needs assessments and simple progress monitoring. The question is therefore not only whether training matters but also whether it is delivered in a way that is relevant, fair, and durable.
This is in line with the human resource development literature, which emphasizes training as a strategic process through which knowledge, skills, and competencies are built to enhance organizational performance. Aguinis and Kraiger (2009) argue that training creates value for individuals, teams, organizations, and society, whereas Jacobs and Washington (2003) highlight employee development as an organizational activity closely related to performance. Likewise, Swanson (2001) positions human resource development as a theory domain and links learning to performance improvement, and Garavan (2007) stresses that HRD is most successful when strategically embedded in broader organizational priorities. Taken together, these works support the present finding that training benefits organizational performance, but that its value is contingent on the degree to which learning is integrated into work practice.
The current study is also consistent with evidence that the organizational value of training is conditional rather than automatic. Sung and Choi (2014) show that training and development contribute to higher levels of learning and innovation, and Potnuru and Sahoo (2016) demonstrate that HRD interventions enhance organizational effectiveness by strengthening employee competencies. Relatedly, Ford et al. (2018) argue that transfer of training remains one of the field’s central challenges because learning does not transfer easily from the training environment to the job. The findings here corroborate this observation, since respondents consistently identified post-training support, coaching, and practical follow-up as preconditions for training to be effective. The study therefore strengthens the case that training matters for organizational outcomes not simply because it takes place but because it is transferred, supported, and implemented effectively at work.
One significant contribution of the results is that they help explain why the effect of training was positive but moderate. While training was significantly related to organizational performance, the weaker dimensions of equitable access and post-training support may indicate implementation shortcomings that constrain training effectiveness. This interpretation is in line with McGraw’s (2014) review of HRD practice, in which the impact of development systems depends heavily on how well they are aligned, organized, and maintained in organizational contexts. It is also consistent with Nguyen (2019), who highlights the importance of institutionally designed HRD systems for strengthening staff capability and training effectiveness. In this study, ministries were found to have basic training arrangements in place, but where access was uneven, post-delivery support irregular, and follow-up weak, training was unlikely to translate into better service outcomes.
This reinforces the view that training should be treated not as a stand-alone event but as an integral component of the wider organizational system. Collins and Smith (2006) showed that human resource practices drive performance by facilitating knowledge exchange and combination, and Kehoe and Wright (2013) found that high-performance HR practices influence employee attitudes and behaviors in ways that shape organizational outcomes. These studies were not conducted in the Kenyan public sector; however, they strengthen the current argument that the value of training is contingent on the organizational arrangements surrounding the work. The descriptive, qualitative, and inferential results of this study converge on the same conclusion: ministries already have training structures in place, but their effectiveness depends on whether training is targeted, accessible, and supported strongly enough to affect daily work.
More generally, these results imply that training in public organizations is best regarded as a formalized performance mechanism rather than a stand-alone intervention. Where training is poorly organized, weakly targeted, unevenly distributed, and insufficiently supported after delivery, its effect on performance is likely to be limited. In contrast, where it is relevant to the workload, supported by follow-up, and part of regular organizational routines, it is more likely to improve service results. This interpretation accords with the strategic HRD perspective of Garavan (2007) and with the findings of Aguinis and Kraiger (2009), Sung and Choi (2014), and Potnuru and Sahoo (2016), who link development systems to capability building and organizational performance. The current study thus goes beyond general claims that training is beneficial and shows why its effects are often modest rather than transformational in public sector organizations.
On the whole, this discussion supports the argument that training and development is a positive driver of organizational performance, with the caveat that its value is maximized only to the extent that ministries create the organizational conditions for learning transfer and ongoing application. The study shows that the association between training and performance across national government ministries is neither automatic nor merely symbolic. The bottom line is that training works best for organizations when it is well designed, closely connected to work needs, fair, and accompanied by support beyond the training period itself.
Policy Implications
The implications for policy are clear. State departments are unlikely to gain much simply by increasing the number of training opportunities. Larger improvements are more likely to come from building a better structure for planning, delivering, supporting, and assessing training. First, training-needs assessment must be institutionalized and linked directly to service-delivery priorities, so that learning addresses actual performance gaps rather than administrative scheduling. Second, training opportunities should be extended across cadres and departments so that access is broad rather than confined to a narrow group. Third, ministries need to support the application of learning after training through coaching, supervision, hands-on tasks, and basic before-and-after measures of work improvement. Fourth, evaluation systems should expand from attendance and completion rates to transfer of learning and operational outcomes. Finally, training should be linked to wider HRD systems so that it supports both short-term work performance and longer-term institutional readiness.
5. Conclusion
In conclusion, the evidence indicates that training and development makes a credible and statistically significant contribution to organisational performance in Kenya’s national government state departments. Its value is not in question. What is most decisive is the quality of implementation. Training is most effective when it is relevant to the demands of work, equitably distributed, supported after delivery, and applied in practice. Thus, the challenge for state departments is not to justify the need for training on paper but to organise training strategically so that it is widely accessible across the system, embedded in routine work, and sustained in practice.
Areas for Future Research
Future research should examine training and organisational performance using longitudinal designs in order to capture change over time and establish stronger causal inference. This would help determine whether improvements in training systems produce sustained gains in organisational performance rather than short-term effects.
Further studies should also incorporate objective performance indicators, such as service turnaround times, complaint data, audit outcomes, and target achievement records, to complement perceptual measures. Such evidence would strengthen the empirical basis for assessing the performance value of training in the public sector.
In addition, future research could extend the analysis to other public-sector settings, including county governments, state corporations, and frontline service agencies, to test whether the same implementation challenges persist across institutional contexts. Comparative studies across sectors would also help clarify the extent to which the present findings are specific to national government ministries or reflect broader public-administration dynamics.