diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2016-05-16 19:17:22 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2016-05-16 19:17:22 -0700 |
| commit | d57d39431924d1628ac9b93a2de7f806fc80680a (patch) | |
| tree | 8d630b5b22333a6368beb3531f20ae5c5eb72229 /drivers/cpufreq/intel_pstate.c | |
| parent | 3e21e5dda4907ecb21a124517ab0eb1d176e5231 (diff) | |
| parent | 27c4a1c5ef61b6d4a9aeae68b24419b4319b97ed (diff) | |
| download | linux-d57d39431924d1628ac9b93a2de7f806fc80680a.tar.gz linux-d57d39431924d1628ac9b93a2de7f806fc80680a.tar.bz2 linux-d57d39431924d1628ac9b93a2de7f806fc80680a.zip | |
Merge tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"The majority of changes go into the cpufreq subsystem this time.
To me, quite obviously, the biggest ticket item is the new "schedutil"
governor. Interestingly enough, it's the first new cpufreq governor
since the beginning of the git era (except for some out-of-the-tree
ones).
There are two main differences between it and the existing governors.
First, it uses the information provided by the scheduler directly for
making its decisions, so it doesn't have to track anything by itself.
Second, it can invoke drivers (supporting that feature) to adjust CPU
performance right away without having to spawn work items to be
executed in process context or similar. Currently, the acpi-cpufreq
driver is the only one supporting that mode of operation, but then it
is used on a large number of systems.
The "schedutil" governor as included here is very simple and mostly
regarded as a foundation for future work on the integration of the
scheduler with CPU power management (in fact, there is work in
progress on top of it already). Nevertheless it works and the
preliminary results obtained with it are encouraging.
There also is some consolidation of CPU frequency management for ARM
platforms that can add their machine IDs the the new stub dt-platdev
driver now and that will take care of creating the requisite platform
device for cpufreq-dt, so it is not necessary to do that in platform
code any more. Several ARM platforms are switched over to using this
generic mechanism.
In addition to that, the intel_pstate driver is now going to respect
CPU frequency limits set by the platform firmware (or a BMC) and
provided via the ACPI _PPC object.
The devfreq subsystem is getting a new "passive" governor for SoCs
subsystems that will depend on somebody else to manage their voltage
rails and its support for Samsung Exynos SoCs is consolidated.
The rest is support for new hardware (Intel Broxton support in
intel_idle for one example), bug fixes, optimizations and cleanups in
a number of places.
Specifics:
- New cpufreq "schedutil" governor (making decisions based on CPU
utilization information provided by the scheduler and capable of
switching CPU frequencies right away if the underlying driver
supports that) and support for fast frequency switching in the
acpi-cpufreq driver (Rafael Wysocki)
- Consolidation of CPU frequency management on ARM platforms allowing
them to get rid of some platform-specific boilerplate code if they
are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao,
Marc Gonzalez)
- Support for ACPI _PPC and CPU frequency limits in the intel_pstate
driver (Srinivas Pandruvada)
- Fixes and cleanups in the cpufreq core and generic governor code
(Rafael Wysocki, Sai Gurrappadi)
- intel_pstate driver optimizations and cleanups (Rafael Wysocki,
Philippe Longepe, Chen Yu, Joe Perches)
- cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri
Bhat)
- cpufreq qoriq driver fixes and cleanups (Jia Hongtao)
- ACPI cpufreq driver cleanups (Viresh Kumar)
- Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang,
Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla)
- Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann)
- Fixes and cleanups in the OPP (Operating Performance Points)
framework, mostly related to OPP sharing, and reorganization of
OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla)
- New "passive" governor for devfreq (for SoC subsystems that will
rely on someone else for the management of their power resources)
and consolidation of devfreq support for Exynos platforms, coding
style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham)
- PM core fixes and cleanups, mostly to make it work better with the
generic power domains (genpd) framework, and updates for that
framework (Ulf Hansson, Thierry Reding, Colin Ian King)
- Intel Broxton support for the intel_idle driver (Len Brown)
- cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach)
- ARM cpuidle cleanups (Jisheng Zhang)
- Intel Kabylake support for the RAPL power capping driver (Jacob
Pan)
- AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko
Stuebner)
- Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King,
Mattia Dongili, Thomas Renninger)"
* tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (112 commits)
intel_pstate: Clean up get_target_pstate_use_performance()
intel_pstate: Use sample.core_avg_perf in get_avg_pstate()
intel_pstate: Clarify average performance computation
intel_pstate: Avoid unnecessary synchronize_sched() during initialization
cpufreq: schedutil: Make default depend on CONFIG_SMP
cpufreq: powernv: del_timer_sync when global and local pstate are equal
cpufreq: powernv: Move smp_call_function_any() out of irq safe block
intel_pstate: Clean up intel_pstate_get()
cpufreq: schedutil: Make it depend on CONFIG_SMP
cpufreq: governor: Fix handling of special cases in dbs_update()
PM / OPP: Move CONFIG_OF dependent code in a separate file
cpufreq: intel_pstate: Ignore _PPC processing under HWP
cpufreq: arm_big_little: use generic OPP functions for {init, free}_opp_table
PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table
cpufreq: tango: Use generic platdev driver
PM / OPP: pass cpumask by reference
cpufreq: Fix GOV_LIMITS handling for the userspace governor
cpupower: fix potential memory leak
PM / devfreq: style/typo fixes
PM / devfreq: exynos: Add the detailed correlation for Exynos5422 bus
..
Diffstat (limited to 'drivers/cpufreq/intel_pstate.c')
| -rw-r--r-- | drivers/cpufreq/intel_pstate.c | 303 |
1 files changed, 234 insertions, 69 deletions
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index b230ebaae66c..b76a98dd9988 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -10,6 +10,8 @@ * of the License. */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include <linux/kernel.h> #include <linux/kernel_stat.h> #include <linux/module.h> @@ -39,10 +41,17 @@ #define ATOM_TURBO_RATIOS 0x66c #define ATOM_TURBO_VIDS 0x66d +#ifdef CONFIG_ACPI +#include <acpi/processor.h> +#endif + #define FRAC_BITS 8 #define int_tofp(X) ((int64_t)(X) << FRAC_BITS) #define fp_toint(X) ((X) >> FRAC_BITS) +#define EXT_BITS 6 +#define EXT_FRAC_BITS (EXT_BITS + FRAC_BITS) + static inline int32_t mul_fp(int32_t x, int32_t y) { return ((int64_t)x * (int64_t)y) >> FRAC_BITS; @@ -64,12 +73,22 @@ static inline int ceiling_fp(int32_t x) return ret; } +static inline u64 mul_ext_fp(u64 x, u64 y) +{ + return (x * y) >> EXT_FRAC_BITS; +} + +static inline u64 div_ext_fp(u64 x, u64 y) +{ + return div64_u64(x << EXT_FRAC_BITS, y); +} + /** * struct sample - Store performance sample - * @core_pct_busy: Ratio of APERF/MPERF in percent, which is actual + * @core_avg_perf: Ratio of APERF/MPERF which is the actual average * performance during last sample period * @busy_scaled: Scaled busy value which is used to calculate next - * P state. This can be different than core_pct_busy + * P state. This can be different than core_avg_perf * to account for cpu idle period * @aperf: Difference of actual performance frequency clock count * read from APERF MSR between last and current sample @@ -84,7 +103,7 @@ static inline int ceiling_fp(int32_t x) * data for choosing next P State. */ struct sample { - int32_t core_pct_busy; + int32_t core_avg_perf; int32_t busy_scaled; u64 aperf; u64 mperf; @@ -162,6 +181,7 @@ struct _pid { * struct cpudata - Per CPU instance data storage * @cpu: CPU number for this instance data * @update_util: CPUFreq utility callback information + * @update_util_set: CPUFreq utility callback is set * @pstate: Stores P state limits for this CPU * @vid: Stores VID limits for this CPU * @pid: Stores PID parameters for this CPU @@ -172,6 +192,8 @@ struct _pid { * @prev_cummulative_iowait: IO Wait time difference from last and * current sample * @sample: Storage for storing last Sample data + * @acpi_perf_data: Stores ACPI perf information read from _PSS + * @valid_pss_table: Set to true for valid ACPI _PSS entries found * * This structure stores per CPU instance data for all CPUs. */ @@ -179,6 +201,7 @@ struct cpudata { int cpu; struct update_util_data update_util; + bool update_util_set; struct pstate_data pstate; struct vid_data vid; @@ -190,6 +213,10 @@ struct cpudata { u64 prev_tsc; u64 prev_cummulative_iowait; struct sample sample; +#ifdef CONFIG_ACPI + struct acpi_processor_performance acpi_perf_data; + bool valid_pss_table; +#endif }; static struct cpudata **all_cpu_data; @@ -258,6 +285,9 @@ static struct pstate_adjust_policy pid_params; static struct pstate_funcs pstate_funcs; static int hwp_active; +#ifdef CONFIG_ACPI +static bool acpi_ppc; +#endif /** * struct perf_limits - Store user and policy limits @@ -331,6 +361,124 @@ static struct perf_limits *limits = &performance_limits; static struct perf_limits *limits = &powersave_limits; #endif +#ifdef CONFIG_ACPI + +static bool intel_pstate_get_ppc_enable_status(void) +{ + if (acpi_gbl_FADT.preferred_profile == PM_ENTERPRISE_SERVER || + acpi_gbl_FADT.preferred_profile == PM_PERFORMANCE_SERVER) + return true; + + return acpi_ppc; +} + +/* + * The max target pstate ratio is a 8 bit value in both PLATFORM_INFO MSR and + * in TURBO_RATIO_LIMIT MSR, which pstate driver stores in max_pstate and + * max_turbo_pstate fields. The PERF_CTL MSR contains 16 bit value for P state + * ratio, out of it only high 8 bits are used. For example 0x1700 is setting + * target ratio 0x17. The _PSS control value stores in a format which can be + * directly written to PERF_CTL MSR. But in intel_pstate driver this shift + * occurs during write to PERF_CTL (E.g. for cores core_set_pstate()). + * This function converts the _PSS control value to intel pstate driver format + * for comparison and assignment. + */ +static int convert_to_native_pstate_format(struct cpudata *cpu, int index) +{ + return cpu->acpi_perf_data.states[index].control >> 8; +} + +static void intel_pstate_init_acpi_perf_limits(struct cpufreq_policy *policy) +{ + struct cpudata *cpu; + int turbo_pss_ctl; + int ret; + int i; + + if (hwp_active) + return; + + if (!intel_pstate_get_ppc_enable_status()) + return; + + cpu = all_cpu_data[policy->cpu]; + + ret = acpi_processor_register_performance(&cpu->acpi_perf_data, + policy->cpu); + if (ret) + return; + + /* + * Check if the control value in _PSS is for PERF_CTL MSR, which should + * guarantee that the states returned by it map to the states in our + * list directly. + */ + if (cpu->acpi_perf_data.control_register.space_id != + ACPI_ADR_SPACE_FIXED_HARDWARE) + goto err; + + /* + * If there is only one entry _PSS, simply ignore _PSS and continue as + * usual without taking _PSS into account + */ + if (cpu->acpi_perf_data.state_count < 2) + goto err; + + pr_debug("CPU%u - ACPI _PSS perf data\n", policy->cpu); + for (i = 0; i < cpu->acpi_perf_data.state_count; i++) { + pr_debug(" %cP%d: %u MHz, %u mW, 0x%x\n", + (i == cpu->acpi_perf_data.state ? '*' : ' '), i, + (u32) cpu->acpi_perf_data.states[i].core_frequency, + (u32) cpu->acpi_perf_data.states[i].power, + (u32) cpu->acpi_perf_data.states[i].control); + } + + /* + * The _PSS table doesn't contain whole turbo frequency range. + * This just contains +1 MHZ above the max non turbo frequency, + * with control value corresponding to max turbo ratio. But + * when cpufreq set policy is called, it will call with this + * max frequency, which will cause a reduced performance as + * this driver uses real max turbo frequency as the max + * frequency. So correct this frequency in _PSS table to + * correct max turbo frequency based on the turbo ratio. + * Also need to convert to MHz as _PSS freq is in MHz. + */ + turbo_pss_ctl = convert_to_native_pstate_format(cpu, 0); + if (turbo_pss_ctl > cpu->pstate.max_pstate) + cpu->acpi_perf_data.states[0].core_frequency = + policy->cpuinfo.max_freq / 1000; + cpu->valid_pss_table = true; + pr_info("_PPC limits will be enforced\n"); + + return; + + err: + cpu->valid_pss_table = false; + acpi_processor_unregister_performance(policy->cpu); +} + +static void intel_pstate_exit_perf_limits(struct cpufreq_policy *policy) +{ + struct cpudata *cpu; + + cpu = all_cpu_data[policy->cpu]; + if (!cpu->valid_pss_table) + return; + + acpi_processor_unregister_performance(policy->cpu); +} + +#else +static void intel_pstate_init_acpi_perf_limits(struct cpufreq_policy *policy) +{ +} + +static void intel_pstate_exit_perf_limits(struct cpufreq_policy *policy) +{ +} +#endif + static inline void pid_reset(struct _pid *pid, int setpoint, int busy, int deadband, int integral) { pid->setpoint = int_tofp(setpoint); @@ -341,17 +489,17 @@ static inline void pid_reset(struct _pid *pid, int setpoint, int busy, static inline void pid_p_gain_set(struct _pid *pid, int percent) { - pid->p_gain = div_fp(int_tofp(percent), int_tofp(100)); + pid->p_gain = div_fp(percent, 100); } static inline void pid_i_gain_set(struct _pid *pid, int percent) { - pid->i_gain = div_fp(int_tofp(percent), int_tofp(100)); + pid->i_gain = div_fp(percent, 100); } static inline void pid_d_gain_set(struct _pid *pid, int percent) { - pid->d_gain = div_fp(int_tofp(percent), int_tofp(100)); + pid->d_gain = div_fp(percent, 100); } static signed int pid_calc(struct _pid *pid, int32_t busy) @@ -537,7 +685,7 @@ static ssize_t show_turbo_pct(struct kobject *kobj, total = cpu->pstate.turbo_pstate - cpu->pstate.min_pstate + 1; no_turbo = cpu->pstate.max_pstate - cpu->pstate.min_pstate + 1; - turbo_fp = div_fp(int_tofp(no_turbo), int_tofp(total)); + turbo_fp = div_fp(no_turbo, total); turbo_pct = 100 - fp_toint(mul_fp(turbo_fp, int_tofp(100))); return sprintf(buf, "%u\n", turbo_pct); } @@ -579,7 +727,7 @@ static ssize_t store_no_turbo(struct kobject *a, struct attribute *b, update_turbo_state(); if (limits->turbo_disabled) { - pr_warn("intel_pstate: Turbo disabled by BIOS or unavailable on processor\n"); + pr_warn("Turbo disabled by BIOS or unavailable on processor\n"); return -EPERM; } @@ -608,8 +756,7 @@ static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b, limits->max_perf_pct); limits->max_perf_pct = max(limits->min_perf_pct, limits->max_perf_pct); - limits->max_perf = div_fp(int_tofp(limits->max_perf_pct), - int_tofp(100)); + limits->max_perf = div_fp(limits->max_perf_pct, 100); if (hwp_active) intel_pstate_hwp_set_online_cpus(); @@ -633,8 +780,7 @@ static ssize_t store_min_perf_pct(struct kobject *a, struct attribute *b, limits->min_perf_pct); limits->min_perf_pct = min(limits->max_perf_pct, limits->min_perf_pct); - limits->min_perf = div_fp(int_tofp(limits->min_perf_pct), - int_tofp(100)); + limits->min_perf = div_fp(limits->min_perf_pct, 100); if (hwp_active) intel_pstate_hwp_set_online_cpus(); @@ -1019,15 +1165,11 @@ static void intel_pstate_get_cpu_pstates(struct cpudata *cpu) intel_pstate_set_min_pstate(cpu); } -static inline void intel_pstate_calc_busy(struct cpudata *cpu) +static inline void intel_pstate_calc_avg_perf(struct cpudata *cpu) { struct sample *sample = &cpu->sample; - int64_t core_pct; - core_pct = int_tofp(sample->aperf) * int_tofp(100); - core_pct = div64_u64(core_pct, int_tofp(sample->mperf)); - - sample->core_pct_busy = (int32_t)core_pct; + sample->core_avg_perf = div_ext_fp(sample->aperf, sample->mperf); } static inline bool intel_pstate_sample(struct cpudata *cpu, u64 time) @@ -1070,9 +1212,14 @@ static inline bool intel_pstate_sample(struct cpudata *cpu, u64 time) static inline int32_t get_avg_frequency(struct cpudata *cpu) { - return fp_toint(mul_fp(cpu->sample.core_pct_busy, - int_tofp(cpu->pstate.max_pstate_physical * - cpu->pstate.scaling / 100))); + return mul_ext_fp(cpu->sample.core_avg_perf, + cpu->pstate.max_pstate_physical * cpu->pstate.scaling); +} + +static inline int32_t get_avg_pstate(struct cpudata *cpu) +{ + return mul_ext_fp(cpu->pstate.max_pstate_physical, + cpu->sample.core_avg_perf); } static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu) @@ -1107,49 +1254,43 @@ static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu) cpu_load = div64_u64(int_tofp(100) * mperf, sample->tsc); cpu->sample.busy_scaled = cpu_load; - return cpu->pstate.current_pstate - pid_calc(&cpu->pid, cpu_load); + return get_avg_pstate(cpu) - pid_calc(&cpu->pid, cpu_load); } static inline int32_t get_target_pstate_use_performance(struct cpudata *cpu) { - int32_t core_busy, max_pstate, current_pstate, sample_ratio; + int32_t perf_scaled, max_pstate, current_pstate, sample_ratio; u64 duration_ns; /* - * core_busy is the ratio of actual performance to max - * max_pstate is the max non turbo pstate available - * current_pstate was the pstate that was requested during - * the last sample period. - * - * We normalize core_busy, which was our actual percent - * performance to what we requested during the last sample - * period. The result will be a percentage of busy at a - * specified pstate. + * perf_scaled is the average performance during the last sampling + * period scaled by the ratio of the maximum P-state to the P-state + * requested last time (in percent). That measures the system's + * response to the previous P-state selection. */ - core_busy = cpu->sample.core_pct_busy; - max_pstate = int_tofp(cpu->pstate.max_pstate_physical); - current_pstate = int_tofp(cpu->pstate.current_pstate); - core_busy = mul_fp(core_busy, div_fp(max_pstate, current_pstate)); + max_pstate = cpu->pstate.max_pstate_physical; + current_pstate = cpu->pstate.current_pstate; + perf_scaled = mul_ext_fp(cpu->sample.core_avg_perf, + div_fp(100 * max_pstate, current_pstate)); /* * Since our utilization update callback will not run unless we are * in C0, check if the actual elapsed time is significantly greater (3x) * than our sample interval. If it is, then we were idle for a long - * enough period of time to adjust our busyness. + * enough period of time to adjust our performance metric. */ duration_ns = cpu->sample.time - cpu->last_sample_time; if ((s64)duration_ns > pid_params.sample_rate_ns * 3) { - sample_ratio = div_fp(int_tofp(pid_params.sample_rate_ns), - int_tofp(duration_ns)); - core_busy = mul_fp(core_busy, sample_ratio); + sample_ratio = div_fp(pid_params.sample_rate_ns, duration_ns); + perf_scaled = mul_fp(perf_scaled, sample_ratio); } else { sample_ratio = div_fp(100 * cpu->sample.mperf, cpu->sample.tsc); if (sample_ratio < int_tofp(1)) - core_busy = 0; + perf_scaled = 0; } - cpu->sample.busy_scaled = core_busy; - return cpu->pstate.current_pstate - pid_calc(&cpu->pid, core_busy); + cpu->sample.busy_scaled = perf_scaled; + return cpu->pstate.current_pstate - pid_calc(&cpu->pid, perf_scaled); } static inline void intel_pstate_update_pstate(struct cpudata *cpu, int pstate) @@ -1179,7 +1320,7 @@ static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu) intel_pstate_update_pstate(cpu, target_pstate); sample = &cpu->sample; - trace_pstate_sample(fp_toint(sample->core_pct_busy), + trace_pstate_sample(mul_ext_fp(100, sample->core_avg_perf), fp_toint(sample->busy_scaled), from, cpu->pstate.current_pstate, @@ -1199,7 +1340,7 @@ static void intel_pstate_update_util(struct update_util_data *data, u64 time, bool sample_taken = intel_pstate_sample(cpu, time); if (sample_taken) { - intel_pstate_calc_busy(cpu); + intel_pstate_calc_avg_perf(cpu); if (!hwp_active) intel_pstate_adjust_busy_pstate(cpu); } @@ -1261,23 +1402,16 @@ static int intel_pstate_init_cpu(unsigned int cpunum) intel_pstate_busy_pid_reset(cpu); - cpu->update_util.func = intel_pstate_update_util; - - pr_debug("intel_pstate: controlling: cpu %d\n", cpunum); + pr_debug("controlling: cpu %d\n", cpunum); return 0; } static unsigned int intel_pstate_get(unsigned int cpu_num) { - struct sample *sample; - struct cpudata *cpu; + struct cpudata *cpu = all_cpu_data[cpu_num]; - cpu = all_cpu_data[cpu_num]; - if (!cpu) - return 0; - sample = &cpu->sample; - return get_avg_frequency(cpu); + return cpu ? get_avg_frequency(cpu) : 0; } static void intel_pstate_set_update_util_hook(unsigned int cpu_num) @@ -1286,12 +1420,20 @@ static void intel_pstate_set_update_util_hook(unsigned int cpu_num) /* Prevent intel_pstate_update_util() from using stale data. */ cpu->sample.time = 0; - cpufreq_set_update_util_data(cpu_num, &cpu->update_util); + cpufreq_add_update_util_hook(cpu_num, &cpu->update_util, + intel_pstate_update_util); + cpu->update_util_set = true; } static void intel_pstate_clear_update_util_hook(unsigned int cpu) { - cpufreq_set_update_util_data(cpu, NULL); + struct cpudata *cpu_data = all_cpu_data[cpu]; + + if (!cpu_data->update_util_set) + return; + + cpufreq_remove_update_util_hook(cpu); + cpu_data->update_util_set = false; synchronize_sched(); } @@ -1311,20 +1453,31 @@ static void intel_pstate_set_performance_limits(struct perf_limits *limits) static int intel_pstate_set_policy(struct cpufreq_policy *policy) { + struct cpudata *cpu; + if (!policy->cpuinfo.max_freq) return -ENODEV; intel_pstate_clear_update_util_hook(policy->cpu); + cpu = all_cpu_data[0]; + if (cpu->pstate.max_pstate_physical > cpu->pstate.max_pstate) { + if (policy->max < policy->cpuinfo.max_freq && + policy->max > cpu->pstate.max_pstate * cpu->pstate.scaling) { + pr_debug("policy->max > max non turbo frequency\n"); + policy->max = policy->cpuinfo.max_freq; + } + } + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) { limits = &performance_limits; if (policy->max >= policy->cpuinfo.max_freq) { - pr_debug("intel_pstate: set performance\n"); + pr_debug("set performance\n"); intel_pstate_set_performance_limits(limits); goto out; } } else { - pr_debug("intel_pstate: set powersave\n"); + pr_debug("set powersave\n"); limits = &powersave_limits; } @@ -1348,10 +1501,8 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) /* Make sure min_perf_pct <= max_perf_pct */ limits->min_perf_pct = min(limits->max_perf_pct, limits->min_perf_pct); - limits->min_perf = div_fp(int_tofp(limits->min_perf_pct), - int_tofp(100)); - limits->max_perf = div_fp(int_tofp(limits->max_perf_pct), - int_tofp(100)); + limits->min_perf = div_fp(limits->min_perf_pct, 100); + limits->max_perf = div_fp(limits->max_perf_pct, 100); out: intel_pstate_set_update_util_hook(policy->cpu); @@ -1377,7 +1528,7 @@ static void intel_pstate_stop_cpu(struct cpufreq_policy *policy) int cpu_num = policy->cpu; struct cpudata *cpu = all_cpu_data[cpu_num]; - pr_debug("intel_pstate: CPU %d exiting\n", cpu_num); + pr_debug("CPU %d exiting\n", cpu_num); intel_pstate_clear_update_util_hook(cpu_num); @@ -1410,12 +1561,20 @@ static int intel_pstate_cpu_init(struct cpufreq_policy *policy) policy->cpuinfo.min_freq = cpu->pstate.min_pstate * cpu->pstate.scaling; policy->cpuinfo.max_freq = cpu->pstate.turbo_pstate * cpu->pstate.scaling; + intel_pstate_init_acpi_perf_limits(policy); policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL; cpumask_set_cpu(policy->cpu, policy->cpus); return 0; } +static int intel_pstate_cpu_exit(struct cpufreq_policy *policy) +{ + intel_pstate_exit_perf_limits(policy); + + return 0; +} + static struct cpufreq_driver intel_pstate_driver = { .flags = CPUFREQ_CONST_LOOPS, .verify = intel_pstate_verify_policy, @@ -1423,6 +1582,7 @@ static struct cpufreq_driver intel_pstate_driver = { .resume = intel_pstate_hwp_set_policy, .get = intel_pstate_get, .init = intel_pstate_cpu_init, + .exit = intel_pstate_cpu_exit, .stop_cpu = intel_pstate_stop_cpu, .name = "intel_pstate", }; @@ -1466,8 +1626,7 @@ static void copy_cpu_funcs(struct pstate_funcs *funcs) } -#if IS_ENABLED(CONFIG_ACPI) -#include <acpi/processor.h> +#ifdef CONFIG_ACPI static bool intel_pstate_no_acpi_pss(void) { @@ -1623,7 +1782,7 @@ hwp_cpu_matched: if (intel_pstate_platform_pwr_mgmt_exists()) return -ENODEV; - pr_info("Intel P-state driver initializing.\n"); + pr_info("Intel P-state driver initializing\n"); all_cpu_data = vzalloc(sizeof(void *) * num_possible_cpus()); if (!all_cpu_data) @@ -1640,7 +1799,7 @@ hwp_cpu_matched: intel_pstate_sysfs_expose_params(); if (hwp_active) - pr_info("intel_pstate: HWP enabled\n"); + pr_info("HWP enabled\n"); return rc; out: @@ -1666,13 +1825,19 @@ static int __init intel_pstate_setup(char *str) if (!strcmp(str, "disable")) no_load = 1; if (!strcmp(str, "no_hwp")) { - pr_info("intel_pstate: HWP disabled\n"); + pr_info("HWP disabled\n"); no_hwp = 1; } if (!strcmp(str, "force")) force_load = 1; if (!strcmp(str, "hwp_only")) hwp_only = 1; + +#ifdef CONFIG_ACPI + if (!strcmp(str, "support_acpi_ppc")) + acpi_ppc = true; +#endif + return 0; } early_param("intel_pstate", intel_pstate_setup); |
