Bug 1127000 - Please set default CMA size to at least 64M, for Tegra
Summary: Please set default CMA size to at least 64M, for Tegra
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Robinson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
 
Reported: 2014-08-05 20:12 UTC by Stephen Warren
Modified: 2014-09-23 05:00 UTC
CC: 11 users

Fixed In Version: kernel-3.16.2-300.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-09-23 05:00:09 UTC
Type: Bug
Embargoed:


Attachments
Fix for the fedpkg (913 bytes, patch)
2014-08-05 23:31 UTC, Stephen Warren

Description Stephen Warren 2014-08-05 20:12:56 UTC
The kernel config option CONFIG_CMA_SIZE_MBYTES sets the default CMA size. CMA is used to allocate video/display memory, at least on Tegra. If the CMA size is too small and the output video resolution too large, the tegradrm driver will not be able to allocate surfaces for the display. This will, for example, prevent X from starting in both the (pxeboot) installer and an installed system.

In practice, 64M seems to be large enough for a 1080p display, which is as large as we currently support on the single-link HDMI we have in HW.

For reference, here's an upstream patch which implements this in tegra_defconfig:
http://patchwork.ozlabs.org/patch/325880/

An alternative to this is to add "cma=64M" on the kernel command-line, but the user would need to edit the bootloader configuration to do this, which is a bit more painful than the kernel "just working".
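
For illustration, the change amounts to roughly the following at the kernel config level (these are the upstream Kconfig symbol names; exactly which Fedora config file carries them is up to the kernel maintainers):

  CONFIG_CMA=y
  CONFIG_DMA_CMA=y
  # default size of the CMA area when cma= is not given on the command line
  CONFIG_CMA_SIZE_MBYTES=64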

Comment 1 Stephen Warren 2014-08-05 23:31:21 UTC
Created attachment 924301 [details]
Fix for the fedpkg

This should fix it. I'm going to build overnight and test tomorrow.

Comment 2 Josh Boyer 2014-08-07 07:05:39 UTC
Someone sent this to the fedora kernel list a while ago.  Peter had questions on it and they were never answered.  Nicolas recently reposted the patch, again without answering the questions.

I'll let you guys work this out.  Let me know when there's consensus.

Comment 3 Stephen Warren 2014-08-07 16:41:00 UTC
FWIW, I've now tested this patch on top of Fedora kernel 3.17.0-0.rc0.git1.1.fc22.armv7hl and it works.

Comment 4 Peter Robinson 2014-08-08 09:40:09 UTC
How does this affect devices that have, say, 256MB of RAM and aren't driving displays? Does the RAM get reserved and hence become unusable by the rest of the OS?

Comment 5 Stephen Warren 2014-08-08 15:53:18 UTC
The size of the CMA region is fixed at boot time; that's what this define does. This in turn affects the maximum amount of RAM that can be allocated using the dedicated CMA APIs.

IIUC, RAM in the CMA region can be used in two ways: (1) allocated specifically by code using CMA APIs (e.g. the Tegra display driver); (2) if not already allocated as CMA, the unused pages can be used in the normal fashion by other kernel allocation routines. If CMA allocations come along later, I believe any other kernel use can be subject to paging, and so the RAM reclaimed for CMA. I suppose this means that certain non-pageable allocations couldn't be made from the CMA region, which might slightly increase certain types of memory pressure.

So, I don't believe this should affect other systems negatively, except that the CMA code might need marginally larger data structures (e.g. bitmaps) to manage a 64M rather than a 16M region.

Note that I'm not an expert on CMA at all, so perhaps you should double-check this.

Comment 6 Peter Robinson 2014-08-08 16:31:12 UTC
> So, I don't believe this should affect other systems negatively, except that
> the CMA code might need marginally larger data structures (e.g. bitmaps) to
> manage a 64M rather than a 16M region.

So I would like to have more confirmation before we change this.

http://cateee.net/lkddb/web-lkddb/CMA_SIZE_MBYTES.html

Note that this is the default set at boot. It can be overridden on the kernel command line, or it seems it can be specified as options in the device tree, so it could be set on a per-device or presumably even a per-SoC basis.

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
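
For reference, a reserved-memory node along the lines of what that binding document describes would look something like this (property names are taken from the binding doc; whether a given kernel actually consumes it is a separate question, see below):

  reserved-memory {
      #address-cells = <1>;
      #size-cells = <1>;
      ranges;

      /* 64 MiB pool used as the default CMA area */
      linux,cma {
          compatible = "shared-dma-pool";
          reusable;
          size = <0x4000000>;
          linux,cma-default;
      };
  };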

Comment 7 Stephen Warren 2014-08-08 19:54:56 UTC
I believe the mention of DMA in that reserved-memory.txt was an example of something that might use the binding. Certainly there appears to be zero code in the kernel that parses CMA size out of the DT (grep -HrnI linux,cma yields nothing on next-20140804).

Comment 8 Peter Robinson 2014-08-13 10:01:16 UTC
(In reply to Stephen Warren from comment #7)
> I believe the mention of DMA in that reserved-memory.txt was an example of
> something that might use the binding. Certainly there appears to be zero
> code in the kernel that parses CMA size out of the DT (grep -HrnI linux,cma
> yields nothing on next-20140804).

It still doesn't answer my questions about the impact on devices other than tegra. At the moment I'm not convinced it won't impact other devices, and even if it's an example, you can still override it on the kernel command line.

Comment 9 Hans de Goede 2014-08-13 11:32:45 UTC
Reading along here (and I envision we will hit similar problems on sunxi in the future), I think that actually allowing this to be specified in DT seems the most sensible option. It might be a hard sell to the DT folks since it is somewhat Linux specific, but we could just call it "reserved-mem-size-hint" or some such and make it OS agnostic that way.

This really is something board specific, and as such belongs in the DT. Note: board specific, not SoC specific; e.g. some boards may be headless, and then reserving a ton of memory for framebuffer use makes no sense.

Comment 10 Stephen Warren 2014-08-13 16:28:54 UTC
Alex Courbot wrote via email:

Sorry for not replying directly on the bug - I don't have an account on RH's bugzilla.

This LWN article explains how CMA works and why it should not affect systems with low memory and no display in a noticeable way:

https://lwn.net/Articles/447405/

"In other words, memory which is marked for use by CMA remains available to the rest of the system with the one restriction that it can only contain movable pages."

Hopefully this should convince the RH people that increasing the CMA size is harmless.

On ARM, memory within the CMA range needs to have its identity map (or lowmem map) done using small pages instead of big pages; but IIRC the switch to small pages only happens when the memory is actually allocated. I can dig into that further if you want.

Hopefully the need to increase the CMA area size is only temporary. Once IOMMU support gets in, we might even not need CMA at all anymore.

Comment 11 Stephen Warren 2014-08-13 16:30:34 UTC
To me, specifying this in DT doesn't feel right, since it's not a HW-driven value.

I'm not sure what the following means:

> ... and even if it's an example, you can still override it on the kernel command line.

Do you mean you aren't convinced whether the cma= command-line option overrides the Kconfig default? That should be easy to check.
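
E.g. something along these lines (the exact dmesg wording varies between kernel versions, so this is just a sketch):

  # boot once with cma=64M appended to the kernel command line, then:
  dmesg | grep -i cma
  grep -i cma /proc/meminfo    # CmaTotal/CmaFree, on kernels that export them

If CmaTotal reports ~65536 kB instead of the 16M Kconfig default, the command-line override wins.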

Comment 12 Peter Robinson 2014-08-13 16:32:22 UTC
> Hopefully this should convince the RH people that increasing the CMA size is
> harmless.

This has nothing to do with "RH people" ... at the moment I do ARM 100% in my own time! Please do not confuse contributors and Red Hat. They are not necessarily linked!

Comment 13 Hans de Goede 2014-08-14 09:24:18 UTC
(In reply to Stephen Warren from comment #11)
> To me, specifying this in DT doesn't feel right, since it's not a HW-driven
> value.

In a sense it is; e.g. a board with a high-res camera attached to its CSI interface and a non-scatter-gather-capable ISP will need a significantly larger CMA area.

Whereas on a board which does not have any peripherals whose drivers need CMA at all, we can make do with a much smaller area.

This is simply not a case of one size fits all. On systems with, say, only 128M of memory, reducing the RAM available for non-movable pages to 64M really is not acceptable. And on some systems (high-res screen + high-res camera) we may very well end up needing even more CMA than 64M.

As for the iommu thing, that will only help on systems which actually have an iommu.

Comment 14 Stephen Warren 2014-08-14 15:43:41 UTC
As mentioned by Alex, increasing the CMA size does not prevent other uses of that RAM. It's not dedicated to CMA.

Comment 15 Hans de Goede 2014-08-14 17:32:23 UTC
(In reply to Stephen Warren from comment #14)
> As mentioned by Alex, increasing the CMA size does not prevent other uses of
> that RAM. It's not dedicated to CMA.

It prevents the RAM from being used for certain purposes (e.g. scatter DMA buffers). Also, if we need to set CMA for the worst-case scenario, the default may well end up being bigger than the RAM on some boards. Really, having a one-size-fits-all default just is not going to work. This *is* a board-specific setting.

Comment 16 Rob Clark 2014-08-15 11:16:17 UTC
(In reply to Stephen Warren from comment #14)
> As mentioned by Alex, increasing the CMA size does not prevent other uses of
> that RAM. It's not dedicated to CMA.

tbh, I don't think a 64m CMA pool on a 256m device is a good idea.  I think it would be much better to get the size from DT, so we can at least pick a value that makes sense on a per-board basis (i.e. how much RAM it has, what screen size/camera resolution it can do, etc.).

Comment 17 Rob Clark 2014-08-15 12:13:28 UTC
fwiw, an example where per-board value is much better than a global (or even per-SoC-vendor) value:

On the bStem board I have, somehow the IOMMU doesn't work, so I need to use CMA for all GPU buffers; so I need a CMA pool of 256m to 348m (or more if someone wants some demanding games, but then we run into CMA vs highmem issues). If we had a global value, rather than per-board, we couldn't set cma=348m globally for all boards!

Comment 18 Nicolas Chauvet (kwizart) 2014-08-15 14:34:39 UTC
(In reply to Rob Clark from comment #17)
> fwiw, an example where per-board value is much better than a global (or even
> per-SoC-vendor) value:
> 
> On the bStem board I have, somehow the IOMMU doesn't work, so I need to use
> CMA for all GPU buffers; so I need a CMA pool of 256m to 348m (or more if
> someone wants some demanding games, but then we run into CMA vs highmem
> issues). If we had a global value, rather than per-board, we couldn't set
> cma=348m globally for all boards!

The original patch was not meant to address the issue of a CMA pool not being "adapted" to a given "usage". Instead, it is meant to address the issue that a CMA-capable graphical device (without arm-boot-config tweaking or the like) should be able to boot into a graphical state in order to do even basic tasks.

I would be pleased if a per-board solution can be found. That would allow the CMA pool to be sized as needed. But given the amount of support currently in the available DTs, I would instead suggest that the discussion about the right amount of CMA for each board is best held on the ARM kernel mailing list.

The current problem is still that graphical boot on CMA-capable devices is broken in our development release. So I would like a solution to be found within the current time-line (before F21 GA, or even Alpha).

The current situation relies on CMA=64 by default. I don't understand why it is fundamentally different from a CMA=16 default, even on a BeagleBone. Which feature would be broken on such a device if the CMA default is bumped from 16 to 64?

In other words:
cma=64 works everywhere
cma=16 breaks in some cases.

Comment 19 Rob Clark 2014-08-15 14:44:12 UTC
(In reply to Nicolas Chauvet (kwizart) from comment #18)
> In other words:
> cma=64 works everywhere
> cma=16 breaks in some cases.

If the only two possible options were cma=64 or cma=16 for all boards, I guess cma=64 would be better, to at least more or less work on all devices out of the box.

But tbh I don't really understand the technical problem preventing per-board values, which seems like a better direction.

Comment 20 Nicolas Chauvet (kwizart) 2014-08-15 15:48:30 UTC
(In reply to Rob Clark from comment #19)
...
> But tbh I don't really understand the technical problem preventing per-board
> values, which seems like a better direction.
It's more a policy and moral issue. We cannot say "get your patches upstream" and at the same time prevent an upstream solution from being applied, deferring the "real" solution to a later kernel.

The current kernel solution since 3.15 is CMA=64M. If a later solution is to provide a per-board value, good; then we will apply (or backport) that solution once it is accepted upstream.

Comment 21 Peter Robinson 2014-08-15 16:01:42 UTC
> It's more a policy and moral issue. We cannot say "get your patches

Moral? What? 

> The current kernel solution since 3.15 is CMA=64M. If a later solution is to
> provide a per-board value, good; then we will apply (or backport) that
> solution once it is accepted upstream.

You can specify it on the kernel command line to override the default, and that's a perfectly valid interim solution for devices that need a larger CMA reservation than the default. It was set small originally precisely because it's a default and can be overridden.
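
For anyone who needs that interim workaround, it's just a matter of adding cma=64M to the kernel arguments in the bootloader config, e.g. (illustrative only, assuming the usual extlinux setup on Fedora ARM; keep whatever options your entry already has and just append cma=64M):

  # /boot/extlinux/extlinux.conf, in the default entry's append line:
  append ro root=... cma=64M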

Comment 22 Nicolas Chauvet (kwizart) 2014-08-15 16:29:15 UTC
(In reply to Peter Robinson from comment #21)

> You can specify it on the kernel command line to override the default, and
We already explained why this is not satisfying. Things must just work.
That's only the case when the default is cma=64M, not 16M.

Comment 23 Stephen Warren 2014-08-21 16:50:01 UTC
A couple of ideas:

* Someone pointed me at a patch that set CMA to 10% of RAM (although IIRC the exact percentage was a Kconfig option). That would scale with RAM size, and so presumably avoid any issues with a large CMA region on systems with little RAM.

* We could increase the CMA size only in the LPAE configuration. That would be enough to get Jetson TK1 working, and presumably most new systems in the future. Presumably anything with a new enough CPU to support LPAE has enough RAM that 64M or more of CMA would be just fine. Users would have to know to use the LPAE-enabled kernel to run the installer, though.

Would either of those options be more palatable?
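
Rough config sketches of both (upstream already has a percentage-based size selection in Kconfig, which seems to be what that patch was using; exactly where these would land in the Fedora config fragments is up to the kernel maintainers):

  # idea 1: percentage-based default, scales with RAM size
  CONFIG_CMA_SIZE_SEL_PERCENTAGE=y
  CONFIG_CMA_SIZE_PERCENTAGE=10

  # idea 2: fixed 64M default, applied only to the LPAE kernel config
  CONFIG_CMA_SIZE_MBYTES=64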

Comment 24 Peter Robinson 2014-08-21 17:15:47 UTC
> Would either of those options be more palatable?

Both of those sound reasonable to me. I'm planning on dealing with Tegra on LPAE today/tomorrow, so we can start with that initially.

Comment 25 Peter Robinson 2014-08-26 10:23:40 UTC
Initially I've applied CMA=64 to the LPAE kernel to allow easier testing on the K1. I plan on testing the percentage option with 3.17rc2 across a number of devices before pushing that to the non-LPAE kernel.

Comment 26 Fedora Update System 2014-09-06 12:46:03 UTC
kernel-3.16.2-300.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/kernel-3.16.2-300.fc21

Comment 27 Fedora Update System 2014-09-08 16:10:20 UTC
Package kernel-3.16.2-300.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.16.2-300.fc21'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-10312/kernel-3.16.2-300.fc21
then log in and leave karma (feedback).

Comment 28 Stephen Warren 2014-09-11 15:11:30 UTC
Running 3.17.0-0.rc4.git2.1.fc22.armv7hl+lpae, I've verified:
* CMA size is 64M in dmesg
* X starts OK (I just ran X and xterm)

Unfortunately, I don't have a Fedora account, so I don't think I can leave Karma.

Thanks.

Comment 29 Josh Boyer 2014-09-11 15:19:27 UTC
Karma isn't needed for rawhide (f22) kernels.  Thanks for verifying though.

Comment 30 Fedora Update System 2014-09-23 05:00:09 UTC
kernel-3.16.2-300.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

