Introduction
============

The 8x50 chipset requires the ability to disable the HW domain manager
function.

The ARM MMU architecture has a feature known as domain manager mode.
Briefly, each page table, section, or supersection is assigned a domain.
Each domain can be globally configured to NoAccess, Client, or Manager
mode. These global configurations allow the access permissions of the
entire domain to be changed simultaneously.

The domain manager emulation is required to fix a HW problem on the 8x50
chipset. The problem is simple to repair except when domain manager mode
is enabled. The emulation allows the problem to be completely resolved.


Hardware description
====================

When domain manager mode is enabled on a specific domain, the MMU
hardware ignores the access permission bits and the execute never bit.
All accesses to memory in the domain are granted full read, write, and
execute permissions.

The mode of each domain is controlled by a field in the cp15 dacr register.
Each domain can be globally configured to NoAccess, Client, or Manager mode.
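
As a rough illustration of the dacr layout described above, the per-domain
field values are the architectural encodings below (the macro and function
names are illustrative, not taken from the kernel headers):

#define DOMAIN_NOACCESS  0x0   /* any access generates a domain fault  */
#define DOMAIN_CLIENT    0x1   /* accesses checked against AP/XN bits  */
#define DOMAIN_MANAGER   0x3   /* AP/XN bits ignored, full access      */

/* each of the 16 domains owns a 2 bit field in dacr */
#define domain_field(dom, type)  ((type) << ((dom) * 2))

static inline u32 read_dacr(void)
{
	u32 dacr;
	asm volatile("mrc p15, 0, %0, c3, c0, 0" : "=r" (dacr));
	return dacr;
}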

See: ARMv7 Architecture Reference Manual


Software description
====================

In order to disable domain manager mode, the equivalent HW functionality
must be emulated in SW. Any attempt to enable domain manager mode must be
intercepted.

Because domain manager mode is never actually enabled, permissions for the
associated domain remain restricted, and permission faults will be
generated. The permission faults are intercepted, and the faulted
pages/sections are modified to grant full access and execute permissions.

The modified page tables must be restored when exiting domain manager mode.
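
At a high level, the emulation flow looks roughly like the following
pseudocode (all helper names are illustrative, not the actual
implementation):

void emulate_set_domain(u32 domain)
{
	/* never let Manager bits reach the HW dacr */
	write_dacr(remove_manager_bits(domain));
	save_manager_bits(domain);          /* remembered in memory    */
	if (left_manager_mode(domain))
		restore_modified_entries(); /* undo opened permissions */
}

int emulate_permission_fault(u32 fsr, u32 far)
{
	if (!domain_is_emulated_manager(domain_of(far)))
		return 0;                   /* not ours, normal fault  */
	grant_full_access(far);             /* open AP bits, clear XN  */
	record_modified_entry(far);         /* for later restore       */
	invalidate_tlb_entry(far);
	return 1;                           /* retry the access        */
}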


Design
======

Design Goals:

Disable Domain Manager Mode
Exact SW emulation of Domain Manager Mode
Minimal Kernel changes
Minimal Security Risk

Design Decisions:

Detect kernel page table modifications on restore
Direct ARMv7 HW MMU table manipulation
Restore emulation modified MMU entries on context switch
No need to restore MMU entries for MMU entry copy operations
Invalidate TLB entries on modification
Store Domain Manager bits in memory
8 entry MMU entry cache
Use spin_lock_irqsave to protect domain manipulation
Assume no split MMU table

Design Discussion:

Detect kernel page table modifications on restore -
When restoring original page/section permissions, the submitted design
verifies that the MMU entry has not been modified. The kernel modifies MMU
entries for the following purposes: creating a memory mapping, releasing a
memory mapping, adding permissions during a permission fault, and mapping a
page during a translation fault. The submitted design works with the listed
scenarios. The translation and permission faults simply do not happen on
relevant entries (valid entries with full access permissions). The
alternative would be to hook every MMU table modification, which greatly
increases complexity and code maintenance issues.
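
A sketch of that restore-time check (function and parameter names are
illustrative only): the saved value is written back only if the live entry
still holds the full-access value the emulation installed.

static void restore_entry(u32 *entry, u32 original, u32 modified)
{
	/* if the kernel re-mapped or released this descriptor in the
	 * meantime, leave it alone; otherwise put the original back */
	if (*entry == modified)
		*entry = original;
}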

Direct ARMv7 HW MMU table manipulation -
The natural choice would be to use the kernel provided mechanism to
manipulate MMU page table entries. The ARM MMU interface is described in
pgtable.h. This interface is complicated by the Linux implementation: the
level 1 pgd entries are treated and manipulated as entry pairs, and the
level 2 entries are shadowed and cloned. The compromise chosen was to use
the ARMv7 HW registers directly to walk and modify the MMU table entries.
This choice limits the usage of this implementation to ARMv7 and similar
ARM MMU architectures. Since this implementation is targeted at fixing an
issue on the 8x50 ARMv7, the choice is logical. The HW manipulation is in
distinct low level functions, which could easily be replaced or generalized
to support other architectures as necessary.
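
A simplified sketch of the low level walk (assuming TTBCR.N == 0, i.e. no
split table, with an illustrative helper name):

static u32 *first_level_entry(u32 mva)
{
	u32 ttbr0;

	/* base of the level 1 table, exactly as the HW sees it */
	asm volatile("mrc p15, 0, %0, c2, c0, 0" : "=r" (ttbr0));

	/* 4096 section/table descriptors, indexed by mva[31:20] */
	return (u32 *)phys_to_virt((ttbr0 & 0xffffc000) +
				   ((mva >> 20) << 2));
}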

Restore emulation modified MMU entries on context switch -
This additional hook was added to minimize performance impact. By
guaranteeing the ASID will not change during the emulation, the emulation
may invalidate each entry by MVA & ASID. Only the affected page table
entries will be removed from the TLB cache. The performance cost of the
invalidate on context switch is near zero. Typically on context switch the
domain mode would also change, forcing a complete restore of all modified
MMU entries. The alternative would be to invalidate the entire TLB every
time a table entry is restored.
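
Per entry invalidation by MVA and ASID maps to the TLBIMVA operation; a
minimal sketch:

static void invalidate_tlb_entry(u32 mva, u32 asid)
{
	/* TLBIMVA: invalidate unified TLB entry by MVA and ASID */
	u32 arg = (mva & 0xfffff000) | (asid & 0xff);

	asm volatile("mcr p15, 0, %0, c8, c7, 1" : : "r" (arg));
	asm volatile("dsb");
	asm volatile("isb");
}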

No need to restore MMU entries for copy operations -
Operations which copy MMU entries are relatively rare in the kernel.
Because we modify the level 2 pte entries directly in hardware, the Linux
shadow copies are left untouched. The kernel treats the shadow copies as
the primary pte entry, so any pte copy operations are unaffected by the HW
modification. On a section translation fault, pgd entries are copied from
the kernel master page table to the current thread page table. Since we
restore MMU entries on context switch, we guarantee the master table will
not contain modifications while faulting on a process local entry. Other
read-modify-write operations occur during permission fault handling. Since
we open permissions on modified entries, these do not need to be restored,
because we guarantee these permission fault operations will not happen.

Invalidate TLB entries on modification -
No real choice here; this is more of a design requirement. On a permission
fault, the MMU entry with restricted permissions will be in the TLB. To
open access permissions, the TLB entry must be invalidated; otherwise the
access will permission fault again. Upon restoring original MMU entries,
the TLB must be invalidated to restrict memory access.

Store Domain Manager bits in memory -
There was only one alternative here. The 2.6.29 kernel only uses 3 of the
16 possible domains, so additional bits in dacr could be used to store the
manager bits. This would allow faster access to the manager bits and would
reduce any performance impact overall. The performance needs did not seem
to justify the added weirdness.
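
The chosen approach simply keeps the requested manager bits in a normal
variable; a minimal sketch (names illustrative):

/* one bit per domain: set when the caller asked for Manager mode */
static u32 emulated_manager_bits;

static inline int domain_is_emulated_manager(unsigned int dom)
{
	return emulated_manager_bits & (1 << dom);
}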

8 entry MMU entry cache -
The size of the modified MMU entry cache is somewhat arbitrary. The thought
process is that typically a thread is using two pointers to perform a copy
operation; in this case only 2 entries would be required. One could imagine
a more complicated operation, a masked copy for instance, which would
require more pointers. 8 pointers seemed large enough to minimize the risk
of permission fault thrashing. The disadvantage of a larger cache would
simply be a longer list of entries to restore.
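
A sketch of the modified entry cache (size from the design above; field
names illustrative):

#define MODIFIED_ENTRY_CACHE_SIZE 8

struct modified_entry {
	u32 mva;          /* address that faulted                  */
	u32 *descriptor;  /* location of the opened page/section   */
	u32 original;     /* descriptor value to restore           */
};

static struct modified_entry entry_cache[MODIFIED_ENTRY_CACHE_SIZE];
static unsigned int entry_cache_count;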

Use spin_lock_irqsave to protect domain manipulation -
The obvious choice.
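
The usage is the standard pattern; a sketch of how an emulation entry point
might take the lock (lock name illustrative):

static DEFINE_SPINLOCK(edm_lock);

void emulate_domain_manager_set(u32 domain)
{
	unsigned long flags;

	spin_lock_irqsave(&edm_lock, flags);
	/* manipulate dacr, the manager bits and the entry cache here */
	spin_unlock_irqrestore(&edm_lock, flags);
}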

Assume no split MMU table -
This same assumption is documented in cpu_v7_switch_mm.


Power Management
================

Not affected.


SMP/multi-core
==============

SMP/multicore is not supported. This is intended as an 8x50 workaround.


Security
========

MMU page/section permissions must be manipulated correctly to emulate domain
manager mode. If page permissions are left in full access mode, any process
can read the associated memory.


Performance
===========

Performance should be impacted only minimally. When emulating domain manager
mode, there is overhead added to MMU table/context switches, set_domain()
calls, data aborts, and prefetch aborts.

Normally the kernel operates with domain != DOMAIN_MANAGER. In this case the
overhead is minimal. An additional check is required to see if domain
manager mode is on. This minimal code is added to each of the emulation
entry points: set, data abort, prefetch abort, and MMU table/context switch.

Initial accesses to an MMU protected page/section will generate a permission
fault. The page will be manipulated to grant full access permissions and
the access will be retried. This will typically require 2-3 page table
walks.

On a context switch, all modified MMU entries will be restored. On thread
resume, additional accesses will be treated as initial accesses.


Interface
=========

The emulation does not have clients. It is hooked to the kernel through a
small list of functions.

void emulate_domain_manager_set(u32 domain);
int emulate_domain_manager_data_abort(u32 dfsr, u32 dfar);
int emulate_domain_manager_prefetch_abort(u32 ifsr, u32 ifar);
void emulate_domain_manager_switch_mm(
	unsigned long pgd_phys,
	struct mm_struct *mm,
	void (*switch_mm)(unsigned long pgd_phys, struct mm_struct *));

emulate_domain_manager_set() is the set_domain handler. This replaces the
direct manipulation of cp15 dacr with a function call. This allows the
emulation to prevent setting the dacr manager bits. It also allows the
emulation to restore page/section permissions when domain manager mode is
disabled.
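
For example, the kernel's set_domain path might be redirected along these
lines (an illustrative sketch; the real hookup lives in the architecture
headers):

static inline void set_domain(unsigned int val)
{
#ifdef CONFIG_EMULATE_DOMAIN_MANAGER_V7
	emulate_domain_manager_set(val);
#else
	asm volatile("mcr p15, 0, %0, c3, c0, 0" : : "r" (val));
#endif
}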

emulate_domain_manager_data_abort() handles data aborts caused by the domain
not being set in HW, and handles the section/page manipulation.

emulate_domain_manager_prefetch_abort() is the similar prefetch abort handler.

emulate_domain_manager_switch_mm() handles MMU table and context switches.
This notifies the emulation that the MMU context is changing, allowing the
emulation to restore page table entry permissions before switching contexts.


Config options
==============

This feature is enabled/disabled by the EMULATE_DOMAIN_MANAGER_V7 config
option.


Dependencies
============

The implementation is for ARMv7, MMU, and !SMP. It targets solving the
issue on the 8x50 chipset.


User space utilities
====================

None


Other
=====

Code is implemented in kernel/arch/arm/mm.


arch/arm/mm/emulate_domain_manager.c contains comments. No additional public
documentation is available or planned.


Known issues
============

There is no intent to support SMP or non-ARMv7 architectures.


To do
=====

None